MUSAN: A Music, Speech, and Noise Corpus

28 October 2015
David Snyder
Guoguo Chen
Daniel Povey

Papers citing "MUSAN: A Music, Speech, and Noise Corpus"

50 / 664 papers shown
OpenSR: Open-Modality Speech Recognition via Maintaining Multi-Modality Alignment
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Xize Cheng
Tao Jin
Lin Li
Wang Lin
Xinyu Duan
Zhou Zhao
VLM
266
20
0
10 Jun 2023
Meta-Learning Framework for End-to-End Imposter Identification in Unseen Speaker Recognition
Automatic Speech Recognition & Understanding (ASRU), 2023
Ashutosh Chaubey
Sparsh Sinha
Susmita Ghose
235
1
0
01 Jun 2023
A Teacher-Student approach for extracting informative speaker embeddings from speech mixtures
Interspeech (Interspeech), 2023
Tobias Cord-Landwehr
Christoph Boeddeker
Catalin Zorila
R. Doddipatla
Reinhold Haeb-Umbach
317
5
0
01 Jun 2023
Make-A-Voice: Unified Voice Synthesis With Discrete Representation
Rongjie Huang
Chunlei Zhang
Yongqiang Wang
Dongchao Yang
Lu Liu
Zhenhui Ye
Ziyue Jiang
Chao Weng
Zhou Zhao
Dong Yu
DiffM
178
34
0
30 May 2023
Improving Textless Spoken Language Understanding with Discrete Units as Intermediate Target
Interspeech (Interspeech), 2023
Guanyong Wu
Guan-Ting Lin
Shang-Wen Li
Hung-yi Lee
220
6
0
29 May 2023
One-Step Knowledge Distillation and Fine-Tuning in Using Large Pre-Trained Self-Supervised Learning Models for Speaker Verification
Interspeech (Interspeech), 2023
Ju-Sung Heo
Chan-yeong Lim
Ju-ho Kim
Hyun-Seo Shin
Ha-Jin Yu
249
6
0
27 May 2023
DistriBlock: Identifying adversarial audio samples by leveraging characteristics of the output distribution
Conference on Uncertainty in Artificial Intelligence (UAI), 2023
Matías P. Pizarro
D. Kolossa
Asja Fischer
AAML
496
2
0
26 May 2023
Visualizing data augmentation in deep speaker recognition
Interspeech (Interspeech), 2023
Pengqi Li
Lantian Li
A. Hamdulla
D. Wang
127
4
0
25 May 2023
AV-TranSpeech: Audio-Visual Robust Speech-to-Speech Translation
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Rongjie Huang
Huadai Liu
Xize Cheng
Yi Ren
Lin Li
...
Jinzheng He
Lichao Zhang
Jinglin Liu
Xiaoyue Yin
Zhou Zhao
209
10
0
24 May 2023
P-vectors: A Parallel-Coupled TDNN/Transformer Network for Speaker Verification
Interspeech (Interspeech), 2023
Xiyuan Wang
Fangyuan Wang
Bo Xu
Liang Xu
Jing Xiao
214
6
0
24 May 2023
Multi-Stream Extension of Variational Bayesian HMM Clustering (MS-VBx) for Combined End-to-End and Vector Clustering-based Diarization
Interspeech (Interspeech), 2023
Marc Delcroix
Naohiro Tawara
Mireia Díez
Federico Landini
Anna Silnova
A. Ogawa
Tomohiro Nakatani
L. Burget
S. Araki
160
7
0
23 May 2023
An Enhanced Res2Net with Local and Global Feature Fusion for Speaker Verification
Interspeech (Interspeech), 2023
Yafeng Chen
Siqi Zheng
Haibo Wang
Luyao Cheng
Qian Chen
Jiajun Qi
187
63
0
22 May 2023
Progressive Sub-Graph Clustering Algorithm for Semi-Supervised Domain Adaptation Speaker Verification
International Conference on Signal Processing, Communications and Computing (ICSPCC), 2023
Zhuo Li
Jingze Lu
Z. Zhao
Wenchao Wang
Pengyuan Zhang
153
1
0
22 May 2023
The HCCL system for VoxCeleb Speaker Recognition Challenge 2022
Zhenduo Zhao
Zhuo Li
Wenchao Wang
Pengyuan Zhang
111
4
0
22 May 2023
On the Efficacy and Noise-Robustness of Jointly Learned Speech Emotion and Automatic Speech Recognition
Interspeech (Interspeech), 2023
L. Bansal
S. P. Dubagunta
Malolan Chetlur
Pushpak Jagtap
A. Ganapathiraju
180
1
0
21 May 2023
Towards Robust Family-Infant Audio Analysis Based on Unsupervised Pretraining of Wav2vec 2.0 on Large-Scale Unlabeled Family Audio
Interspeech (Interspeech), 2023
Jialu Li
M. Hasegawa-Johnson
Nancy L. McElwain
258
14
0
21 May 2023
Blank-regularized CTC for Frame Skipping in Neural Transducer
Interspeech (Interspeech), 2023
Yifan Yang
Xiaoyu Yang
Liyong Guo
Zengwei Yao
Wei Kang
Fangjun Kuang
Long Lin
Xie Chen
Daniel Povey
136
11
0
19 May 2023
Cross-Modal Global Interaction and Local Alignment for Audio-Visual Speech Recognition
International Joint Conference on Artificial Intelligence (IJCAI), 2023
Yuchen Hu
Ruizhe Li
Chen Chen
Heqing Zou
Qiu-shi Zhu
Eng Siong Chng
212
14
0
16 May 2023
Ripple sparse self-attention for monaural speech enhancement
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Qiquan Zhang
Hongxu Zhu
Qi Song
Xinyuan Qian
Zhaoheng Ni
Haizhou Li
100
9
0
15 May 2023
Deep Audio-Visual Singing Voice Transcription based on Self-Supervised Learning Models
Xiangming Gu
Weizhen Zeng
Jianan Zhang
Longshen Ou
Ye Wang
246
6
0
24 Apr 2023
Multi-channel Speech Separation Using Spatially Selective Deep Non-linear Filters
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Kristina Tesch
Timo Gerkmann
168
34
0
24 Apr 2023
MER 2023: Multi-label Learning, Modality Robustness, and Semi-Supervised Learning
ACM Multimedia (ACM MM), 2023
Zheng Lian
Haiyang Sun
Guoying Zhao
Kang Chen
Mingyu Xu
...
Meng Wang
Xiaoshi Zhong
Guoying Zhao
Björn W. Schuller
Jianhua Tao
269
81
0
18 Apr 2023
Fast Random Approximation of Multi-channel Room Impulse Response
Yi Luo
Rongzhi Gu
204
8
0
17 Apr 2023
Efficient Sequence Transduction by Jointly Predicting Tokens and Durations
International Conference on Machine Learning (ICML), 2023
Hainan Xu
Fei Jia
Somshubra Majumdar
Hengguan Huang
Shinji Watanabe
Boris Ginsburg
180
44
0
13 Apr 2023
Self-Supervised Learning with Cluster-Aware-DINO for High-Performance Robust Speaker Verification
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Bing Han
Zhengyang Chen
Y. Qian
143
36
0
12 Apr 2023
Margin-Mixup: A Method for Robust Speaker Verification in Multi-Speaker Audio
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Jenthe Thienpondt
N. Madhu
Kris Demuynck
124
7
0
07 Apr 2023
To Wake-up or Not to Wake-up: Reducing Keyword False Alarm by Successive Refinement
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Yashas Malur Saidutta
R. S. Srinivasa
Ching Hua Lee
Chouchang Yang
Yilin Shen
Hongxia Jin
154
2
0
06 Apr 2023
Cluster-Guided Unsupervised Domain Adaptation for Deep Speaker Embedding
IEEE Signal Processing Letters (IEEE SPL), 2023
Haiquan Mao
Fenglu Hong
Man-Wai Mak
183
10
0
28 Mar 2023
Exploring Turkish Speech Recognition via Hybrid CTC/Attention Architecture and Multi-feature Fusion Network
Zeyu Ren
Nurmemet Yolwas
Huiru Wang
Wushour Slamu
75
0
0
22 Mar 2023
DS-TDNN: Dual-stream Time-delay Neural Network with Global-aware Filter for Speaker Verification
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Yangfu Li
Jiapan Gan
Xiaodan Lin
275
9
0
20 Mar 2023
ERSAM: Neural Architecture Search For Energy-Efficient and Real-Time Social Ambiance Measurement
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Chaojian Li
Wenwan Chen
Jiayi Yuan
Yingyan Lin
Ashutosh Sabharwal
245
0
0
19 Mar 2023
Enhancing Unsupervised Audio Representation Learning via Adversarial Sample Generation
Yulin Pan
Xiangteng He
Biao Gong
Yuxin Peng
Yiliang Lv
SSL
117
0
0
15 Mar 2023
Neural Diarization with Non-autoregressive Intermediate Attractors
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Yusuke Fujita
Tatsuya Komatsu
Robin Scheibler
Yusuke Kida
Tetsuji Ogawa
223
14
0
13 Mar 2023
MixSpeech: Cross-Modality Self-Learning with Audio-Visual Stream Mixup for Visual Speech Translation and Recognition
IEEE International Conference on Computer Vision (ICCV), 2023
Xize Cheng
Lin Li
Tao Jin
Rongjie Huang
Wang Lin
Zehan Wang
Huangdai Liu
Yejin Wang
Aoxiong Yin
Zhou Zhao
210
29
0
09 Mar 2023
TOLD: A Novel Two-Stage Overlap-Aware Framework for Speaker Diarization
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Jiaming Wang
Zhihao Du
Shiliang Zhang
124
8
0
08 Mar 2023
Improving Transformer-based End-to-End Speaker Diarization by Assigning Auxiliary Losses to Attention Heads
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Ye-Rin Jeoung
Joon-Young Yang
Jeong-Hwan Choi
Joon-Hyuk Chang
70
15
0
02 Mar 2023
Distilling Multi-Level X-vector Knowledge for Small-footprint Speaker Verification
Xuechen Liu
Md. Sahidullah
Tomi Kinnunen
274
6
0
02 Mar 2023
MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation
Interspeech (Interspeech), 2023
Mohamed Anwar
Bowen Shi
Vedanuj Goswami
Wei-Ning Hsu
J. Pino
Changhan Wang
227
44
0
01 Mar 2023
CAM++: A Fast and Efficient Network for Speaker Verification Using Context-Aware Masking
Interspeech (Interspeech), 2023
Haibo Wang
Siqi Zheng
Yafeng Chen
Luyao Cheng
Qian Chen
160
157
0
01 Mar 2023
Distance-based Weight Transfer from Near-field to Far-field Speaker Verification
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Li Zhang
Qing Wang
Hongji Wang
Yue Li
Wei Rao
Yannan Wang
Linfu Xie
203
5
0
01 Mar 2023
PCF: ECAPA-TDNN with Progressive Channel Fusion for Speaker Verification
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Z. Zhao
Zhuo Li
Wenchao Wang
Pengyuan Zhang
118
33
0
01 Mar 2023
Practice of the conformer enhanced AUDIO-VISUAL HUBERT on Mandarin and English
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Xiaoming Ren
Chao Li
Shenjian Wang
Biao Li
130
0
0
28 Feb 2023
Ensemble knowledge distillation of self-supervised speech models
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Kuan-Po Huang
Tzu-hsun Feng
Yu-Kuan Fu
Tsung-Yuan Hsu
Po-Chieh Yen
Wei-Cheng Tseng
Kai-Wei Chang
Hung-yi Lee
278
21
0
24 Feb 2023
Cross-modal Audio-visual Co-learning for Text-independent Speaker Verification
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Meng Liu
Kong Aik Lee
Longbiao Wang
Hanyi Zhang
Chang Zeng
Jianwu Dang
171
13
0
22 Feb 2023
Advancing Stuttering Detection via Data Augmentation, Class-Balanced Loss and Multi-Contextual Deep Learning
IEEE Journal of Biomedical and Health Informatics (IEEE JBHI), 2023
S. A. Sheikh
Md. Sahidullah
F. Hirsch
Slim Ouni
194
23
0
21 Feb 2023
VoxSRC 2022: The Fourth VoxCeleb Speaker Recognition Challenge
Jaesung Huh
A. Brown
Jee-weon Jung
Joon Son Chung
Arsha Nagrani
D. Garcia-Romero
Andrew Zisserman
233
30
0
20 Feb 2023
RobustDistiller: Compressing Universal Speech Representations for Enhanced Environment Robustness
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Heitor R. Guimarães
Arthur Pimentel
Anderson R. Avila
Mehdi Rezagholizadeh
Boxing Chen
Tiago H. Falk
362
12
0
18 Feb 2023
Improving Transformer-based Networks With Locality For Automatic Speaker Verification
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Mufan Sang
Yong Zhao
Gang Liu
John H. L. Hansen
Jian Wu
ViT
216
15
0
17 Feb 2023
Cross-Corpora Spoken Language Identification with Domain Diversification and Generalization
Computer Speech and Language (CSL), 2023
Spandan Dey
Md. Sahidullah
G. Saha
142
13
0
10 Feb 2023
MooseNet: A Trainable Metric for Synthesized Speech with a PLDA Module
Speech Synthesis Workshop (SSW), 2023
Ondřej Plátek
Ondrej Dusek
188
2
0
17 Jan 2023