MUSAN: A Music, Speech, and Noise Corpus

28 October 2015
David Snyder
Guoguo Chen
Daniel Povey
arXiv: 1510.08484

Papers citing "MUSAN: A Music, Speech, and Noise Corpus"

50 / 664 papers shown
Probabilistic Fusion and Calibration of Neural Speaker Diarization Models
Juan Ignacio Alvarez-Trejos
Sérgio A. Balanya
D. Ramos
Alicia Lozano-Diez
UQCV
138
0
0
27 Nov 2025
Continual Audio Deepfake Detection via Universal Adversarial Perturbation
Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2025
W. Li
Lin Li
Q. Hong
AAML
248
0
0
25 Nov 2025
Robust Neural Audio Fingerprinting using Music Foundation Models
Shubhr Singh
Kiran Bhat
Xavier Riley
Benjamin Resnick
John Thickstun
Walter De Brouwer
125
0
0
07 Nov 2025
MERaLiON-SER: Robust Speech Emotion Recognition Model for English and SEA Languages
Hardik B. Sailor
Aw Ai Ti
Chen Fang Yih Nancy
Chiu Ying Lay
Ding Yang
...
Wong Heng Meng Jeremy
Wu Jinyang
Zhang Huayun
Zhang Longyin
Zou Xunlong
AuLLM
416
0
0
07 Nov 2025
CantoASR: Prosody-Aware ASR-LALM Collaboration for Low-Resource Cantonese
Dazhong Chen
Yi-Cheng Lin
Yuchen Huang
Ziwei Gong
Di Jiang
Zeying Xie
Yi R. Fung
100
0
0
06 Nov 2025
Open Source State-Of-the-Art Solution for Romanian Speech Recognition
Gabriel Pirlogeanu
Alexandru-Lucian Georgescu
Horia Cucu
92
0
0
05 Nov 2025
ADNAC: Audio Denoiser using Neural Audio Codec
Daniel Jimon
Mircea Vaida
Adriana Stan
88
0
0
03 Nov 2025
UniTok-Audio: A Unified Audio Generation Framework via Generative Modeling on Discrete Codec Tokens
Chengwei Liu
Haoyin Yan
Shaofei Xue
Xiaotao Liang
Yinghao Liu
Zheng Xue
Gang Song
Boyang Zhou
231
2
0
30 Oct 2025
Ming-UniAudio: Speech LLM for Joint Understanding, Generation and Editing with Unified Representation
C. Yan
Chunxiang Jin
Dawei Huang
Haibing Yu
Han Peng
...
Yongjie Lyu
Z. He
Zhihao Qiu
Zhiqiang Fang
Ziyuan Huang
AuLLM
393
4
0
26 Oct 2025
UniSE: A Unified Framework for Decoder-only Autoregressive LM-based Speech Enhancement
Haoyin Yan
Chengwei Liu
Shaofei Xue
Xiaotao Liang
Zheng Xue
138
3
0
23 Oct 2025
Re-evaluating Minimum Bayes Risk Decoding for Automatic Speech Recognition
Yuu Jinnai
112
0
0
22 Oct 2025
A Stage-Wise Learning Strategy with Fixed Anchors for Robust Speaker Verification
Bin Gu
Lipeng Dai
Huipeng Du
Haitao Zhao
Jibo Wei
72
0
0
21 Oct 2025
Noise-Conditioned Mixture-of-Experts Framework for Robust Speaker Verification
Bin Gu
Lipeng Dai
Huipeng Du
Haitao Zhao
Jibo Wei
AAML, MoE
162
0
0
21 Oct 2025
Adaptive Per-Channel Energy Normalization Front-end for Robust Audio Signal Processing
Hanyu Meng
V. Sethu
E. Ambikairajah
Qiquan Zhang
Haizhou Li
88
0
0
21 Oct 2025
Transformer Redesign for Late Fusion of Audio-Text Features on Ultra-Low-Power Edge Hardware
Stavros Mitsis
Ermos Hadjikyriakos
Humaid Ibrahim
Savvas Neofytou
Shashwat Raman
James Myles
Eiman Kanjo
90
0
0
20 Oct 2025
Two Heads Are Better Than One: Audio-Visual Speech Error Correction with Dual Hypotheses
S. Kim
Kangwook Jang
Sungwoo Cho
Joon Son Chung
Hoirin Kim
Se-Young Yun
101
0
0
15 Oct 2025
HyWA: Hypernetwork Weight Adapting Personalized Voice Activity Detection
Mahsa Ghazvini Nejad
Hamed Jafarzadeh Asl
Amin Edraki
Mohammadreza Sadeghi
M. Asgharian
Yuanhao Yu
Vahid Partovi Nia
118
0
0
14 Oct 2025
Enhancing Speaker Verification with w2v-BERT 2.0 and Knowledge Distillation guided Structured Pruning
Ze Li
Ming Cheng
Ming Li
VLM
124
0
0
05 Oct 2025
High-Fidelity Speech Enhancement via Discrete Audio Tokens
Luca A. Lanzendörfer
Frédéric Berdoz
Antonis Asonitis
Roger Wattenhofer
92
0
0
02 Oct 2025
RealClass: A Framework for Classroom Speech Simulation with Public Datasets and Game Engines
Ahmed Adel Attia
Jing Liu
Carol Espy-Wilson
156
0
0
01 Oct 2025
UniFlow-Audio: Unified Flow Matching for Audio Generation from Omni-Modalities
Xuenan Xu
Jiahao Mei
Zihao Zheng
Ye Tao
Zeyu Xie
...
Yuning Wu
Ming Yan
Wen Wu
Chao Zhang
Mengyue Wu
VGen
119
3
0
29 Sep 2025
Generalizable Speech Deepfake Detection via Information Bottleneck Enhanced Adversarial Alignment
Pu Huang
Shouguang Wang
Siya Yao
Mengchu Zhou
128
0
0
28 Sep 2025
Addressing Gradient Misalignment in Data-Augmented Training for Robust Speech Deepfake Detection
Duc-Tuan Truong
Tianchi Liu
Junjie Li
Ruijie Tao
Kong Aik Lee
Eng Siong Chng
105
0
0
25 Sep 2025
How Does Instrumental Music Help SingFake Detection?
Xuanjun Chen
Chia-Yu Hu
I-Ming Lin
Yi-Cheng Lin
I-Hsiang Chiu
...
Sung-Feng Huang
Yi-Hsuan Yang
Haibin Wu
Hung-yi Lee
Jyh-Shing Roger Jang
107
0
0
18 Sep 2025
Canary-1B-v2 & Parakeet-TDT-0.6B-v3: Efficient and High-Performance Models for Multilingual ASR and AST
Monica Sekoyan
Nithin Rao Koluguri
Nune Tadevosyan
Piotr Żelasko
Travis M. Bartley
Nick Karpov
Jagadeesh Balam
Boris Ginsburg
VLM
162
4
0
17 Sep 2025
FireRedChat: A Pluggable, Full-Duplex Voice Interaction System with Cascaded and Semi-Cascaded Implementations
Junjie Chen
Yao Hu
Junjie Li
K. Li
Kun Liu
...
Manzhen Wei
Yichen Wu
Fenglong Xie
K. Xu
Kun Xie
187
3
0
08 Sep 2025
Xi+: Uncertainty Supervision for Robust Speaker Embedding
Junjie Li
Kong Aik Lee
Duc-Tuan Truong
Tianchi Liu
Man-Wai Mak
204
0
0
07 Sep 2025
Enhancing Self-Supervised Speaker Verification Using Similarity-Connected Graphs and GCN
Zhaorui Sun
Yihao Chen
Jialong Wang
Minqiang Xu
Lei Fang
Sian Fang
Lin Liu
SSL
144
1
0
04 Sep 2025
Denoising GER: A Noise-Robust Generative Error Correction with LLM for Speech Recognition
Yanyan Liu
Minqiang Xu
Yihao Chen
Liang He
Lei Fang
Sian Fang
Lin Liu
VLM
107
2
0
04 Sep 2025
Speech DF Arena: A Leaderboard for Speech DeepFake Detection Models
Sandipana Dowerah
Atharva Kulkarni
Ajinkya Kulkarni
Hoan My Tran
Joonas Kalda
Artem Fedorchenko
Benoit Fauve
Damien Lolive
Tanel Alumae
Matthew Magimai Doss
ELM
77
2
0
02 Sep 2025
Zero-Shot KWS for Children's Speech using Layer-Wise Features from SSL Models
Pattern Recognition Letters (Pattern Recogn. Lett.), 2025
Subham Kutum
Abhijit Sinha
H. Kathania
Sudarsana Reddy Kadiri
Mahesh Chandra Govil
88
1
0
28 Aug 2025
Improving Noise Robust Audio-Visual Speech Recognition via Router-Gated Cross-Modal Feature Fusion
DongHoon Lim
YoungChae Kim
Dong-Hyun Kim
Da-Hee Yang
Joon-Hyuk Chang
98
0
0
26 Aug 2025
Any-to-any Speaker Attribute Perturbation for Asynchronous Voice Anonymization
IEEE Transactions on Information Forensics and Security (TIFS), 2025
Liping Chen
Chenyang Guo
Rui Wang
Kong Aik Lee
Zhenhua Ling
AAML
88
1
0
21 Aug 2025
Beyond Transcription: Mechanistic Interpretability in ASR
Neta Glazer
Yael Segal-Feldman
Hilit Segev
Aviv Shamsian
Asaf Buchnick
Gill Hetz
Ethan Fetaya
Joseph Keshet
Aviv Navon
92
0
0
21 Aug 2025
HuBERT-VIC: Improving Noise-Robust Automatic Speech Recognition of Speech Foundation Model via Variance-Invariance-Covariance Regularization
Hyebin Ahn
Kangwook Jang
Hoirin Kim
92
1
0
17 Aug 2025
Speech Emotion Recognition Using Fine-Tuned DWFormer:A Study on Track 1 of the IERPChallenge 2024
International Symposium on Chinese Spoken Language Processing (ISCSLP), 2024
Honghong Wang
Xupeng Jia
Jing Deng
Rong Zheng
104
0
0
15 Aug 2025
Fake Speech Wild: Detecting Deepfake Speech on Social Media Platform
Yuankun Xie
Ruibo Fu
Xiaopeng Wang
Zhiyong Wang
Ya Li
Zhengqi Wen
Haonnan Cheng
Long Ye
71
0
0
14 Aug 2025
Multi-Target Backdoor Attacks Against Speaker Recognition
Alexandrine Fortier
Sonal Joshi
Thomas Thebaud
Jesus Villalba Lopez
Najim Dehak
P. Cardinal
AAML
256
1
0
12 Aug 2025
ParaNoise-SV: Integrated Approach for Noise-Robust Speaker Verification with Parallel Joint Learning of Speech Enhancement and Noise Extraction
Minu Kim
Kangwook Jang
Hoirin Kim
72
0
0
10 Aug 2025
Multilingual Source Tracing of Speech Deepfakes: A First Benchmark
Xi Xuan
Yang Xiao
Rohan Kumar Das
Tomi Kinnunen
157
3
0
06 Aug 2025
Keyword Spotting with Hyper-Matched Filters for Small Footprint Devices
Yael Segal-Feldman
Ann R. Bradlow
Matthew A. Goldrick
Joseph Keshet
ObjD
150
1
0
06 Aug 2025
PatchDSU: Uncertainty Modeling for Out of Distribution Generalization in Keyword Spotting
Bronya R. Chernyak
Yael Segal
Yosi Shrem
Joseph Keshet
OOD, UQCV
168
0
0
05 Aug 2025
Generalizable Audio Deepfake Detection via Hierarchical Structure Learning and Feature Whitening in Poincaré sphere
M. Yang
Yanmei Gu
Qianhua He
Yanxiong Li
Peirong Zhang
Yongqiang Chen
Zhiming Wang
Huijia Zhu
Jian Liu
Weiqiang Wang
130
2
0
03 Aug 2025
Evaluating and Improving the Robustness of Speech Command Recognition Models to Noise and Distribution Shifts
Anaïs Baranger
Lucas Maison
137
0
0
30 Jul 2025
Whilter: A Whisper-based Data Filter for "In-the-Wild" Speech Corpora Using Utterance-level Multi-Task Classification
William Ravenscroft
George Close
Kit Bower-Morris
Jamie Stacey
Dmitry Sityaev
Kris Y. Hong
201
1
0
29 Jul 2025
FD-Bench: A Full-Duplex Benchmarking Pipeline Designed for Full Duplex Spoken Dialogue Systems
Yizhou Peng
Yi-Wen Chao
Dianwen Ng
Yukun Ma
Chongjia Ni
Bin Ma
Eng Siong Chng
ALM
137
3
0
25 Jul 2025
MLLM-based Speech Recognition: When and How is Multimodality Beneficial?
Yiwen Guan
V. Trinh
Vivek Voleti
Jacob Whitehill
211
1
0
25 Jul 2025
Cross-Modal Distillation For Widely Differing Modalities
Cairong Zhao
Yufeng Jin
Zifan Song
Haonan Chen
Duoqian Miao
Guosheng Hu
188
1
0
22 Jul 2025
LENS-DF: Deepfake Detection and Temporal Localization for Long-Form Noisy Speech
Xuechen Liu
W. Ge
Xin Eric Wang
Junichi Yamagishi
143
0
0
22 Jul 2025
MuteSwap: Visual-informed Silent Video Identity Conversion
Yifan Liu
Yu Fang
Zhouhan Lin
199
0
0
01 Jul 2025