MUSAN: A Music, Speech, and Noise Corpus

28 October 2015

Papers citing "MUSAN: A Music, Speech, and Noise Corpus"

OpenSR: Open-Modality Speech Recognition via Maintaining Multi-Modality Alignment

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Xize Cheng

Zhou Zhao

266

10 Jun 2023

Meta-Learning Framework for End-to-End Imposter Identification in Unseen Speaker Recognition

Automatic Speech Recognition & Understanding (ASRU), 2023

Ashutosh Chaubey

Sparsh Sinha

Susmita Ghose

235

01 Jun 2023

A Teacher-Student approach for extracting informative speaker embeddings from speech mixtures

Interspeech (Interspeech), 2023

317

01 Jun 2023

Make-A-Voice: Unified Voice Synthesis With Discrete Representation

Rongjie Huang

Dongchao Yang

Zhou Zhao

Dong Yu

DiffM

178

30 May 2023

Improving Textless Spoken Language Understanding with Discrete Units as Intermediate Target

Interspeech (Interspeech), 2023

Guanyong Wu

Guan-Ting Lin

Shang-Wen Li

Hung-yi Lee

220

29 May 2023

One-Step Knowledge Distillation and Fine-Tuning in Using Large Pre-Trained Self-Supervised Learning Models for Speaker Verification

Interspeech (Interspeech), 2023

249

27 May 2023

DistriBlock: Identifying adversarial audio samples by leveraging characteristics of the output distribution

Conference on Uncertainty in Artificial Intelligence (UAI), 2023

496

26 May 2023

Visualizing data augmentation in deep speaker recognition

Interspeech (Interspeech), 2023

127

25 May 2023

AV-TranSpeech: Audio-Visual Robust Speech-to-Speech Translation

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Rongjie Huang

Xize Cheng

...

Zhou Zhao

209

24 May 2023

P-vectors: A Parallel-Coupled TDNN/Transformer Network for Speaker Verification

Interspeech (Interspeech), 2023

Bo Xu

214

24 May 2023

Multi-Stream Extension of Variational Bayesian HMM Clustering (MS-VBx) for Combined End-to-End and Vector Clustering-based Diarization

Interspeech (Interspeech), 2023

160

23 May 2023

An Enhanced Res2Net with Local and Global Feature Fusion for Speaker Verification

Interspeech (Interspeech), 2023

Siqi Zheng

Qian Chen

187

22 May 2023

Progressive Sub-Graph Clustering Algorithm for Semi-Supervised Domain Adaptation Speaker Verification

International Conference on Signal Processing, Communications and Computing (ICSPCC), 2023

Zhuo Li

Jingze Lu

Z. Zhao

Wenchao Wang

Pengyuan Zhang

153

22 May 2023

The HCCL system for VoxCeleb Speaker Recognition Challenge 2022

Zhenduo Zhao

Zhuo Li

Wenchao Wang

Pengyuan Zhang

111

22 May 2023

On the Efficacy and Noise-Robustness of Jointly Learned Speech Emotion and Automatic Speech Recognition

Interspeech (Interspeech), 2023

180

21 May 2023

Towards Robust Family-Infant Audio Analysis Based on Unsupervised Pretraining of Wav2vec 2.0 on Large-Scale Unlabeled Family Audio

Interspeech (Interspeech), 2023

Jialu Li

M. Hasegawa-Johnson

Nancy L. McElwain

258

21 May 2023

Blank-regularized CTC for Frame Skipping in Neural Transducer

Interspeech (Interspeech), 2023

Yifan Yang

Xiaoyu Yang

Liyong Guo

Zengwei Yao

Wei Kang

Fangjun Kuang

Long Lin

Xie Chen

Daniel Povey

136

19 May 2023

Cross-Modal Global Interaction and Local Alignment for Audio-Visual Speech Recognition

International Joint Conference on Artificial Intelligence (IJCAI), 2023

Yuchen Hu

Chen Chen

212

16 May 2023

Ripple sparse self-attention for monaural speech enhancement

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Xinyuan Qian

Haizhou Li

100

15 May 2023

Deep Audio-Visual Singing Voice Transcription based on Self-Supervised Learning Models

246

24 Apr 2023

Multi-channel Speech Separation Using Spatially Selective Deep Non-linear Filters

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023

Kristina Tesch

Timo Gerkmann

168

24 Apr 2023

MER 2023: Multi-label Learning, Modality Robustness, and Semi-Supervised Learning

ACM Multimedia (ACM MM), 2023

...

Björn W. Schuller

269

18 Apr 2023

Fast Random Approximation of Multi-channel Room Impulse Response

Yi Luo

Rongzhi Gu

204

17 Apr 2023

Efficient Sequence Transduction by Jointly Predicting Tokens and Durations

International Conference on Machine Learning (ICML), 2023

Hainan Xu

Boris Ginsburg

180

13 Apr 2023

Self-Supervised Learning with Cluster-Aware-DINO for High-Performance Robust Speaker Verification

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023

Bing Han

Zhengyang Chen

Y. Qian

143

12 Apr 2023

Margin-Mixup: A Method for Robust Speaker Verification in Multi-Speaker Audio

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Jenthe Thienpondt

N. Madhu

Kris Demuynck

124

07 Apr 2023

To Wake-up or Not to Wake-up: Reducing Keyword False Alarm by Successive Refinement

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Yashas Malur Saidutta

154

06 Apr 2023

Cluster-Guided Unsupervised Domain Adaptation for Deep Speaker Embedding

IEEE Signal Processing Letters (IEEE SPL), 2023

Haiquan Mao

Fenglu Hong

Man-Wai Mak

183

28 Mar 2023

Exploring Turkish Speech Recognition via Hybrid CTC/Attention Architecture and Multi-feature Fusion Network

22 Mar 2023

DS-TDNN: Dual-stream Time-delay Neural Network with Global-aware Filter for Speaker Verification

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023

Yangfu Li

Jiapan Gan

Xiaodan Lin

275

20 Mar 2023

ERSAM: Neural Architecture Search For Energy-Efficient and Real-Time Social Ambiance Measurement

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

245

19 Mar 2023

Enhancing Unsupervised Audio Representation Learning via Adversarial Sample Generation

Yuxin Peng

117

15 Mar 2023

Neural Diarization with Non-autoregressive Intermediate Attractors

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

223

13 Mar 2023

MixSpeech: Cross-Modality Self-Learning with Audio-Visual Stream Mixup for Visual Speech Translation and Recognition

IEEE International Conference on Computer Vision (ICCV), 2023

Xize Cheng

Rongjie Huang

Zhou Zhao

210

09 Mar 2023

TOLD: A Novel Two-Stage Overlap-Aware Framework for Speaker Diarization

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Jiaming Wang

Zhihao Du

Shiliang Zhang

124

08 Mar 2023

Improving Transformer-based End-to-End Speaker Diarization by Assigning Auxiliary Losses to Attention Heads

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Ye-Rin Jeoung

Joon-Young Yang

Jeong-Hwan Choi

Joon-Hyuk Chang

02 Mar 2023

Distilling Multi-Level X-vector Knowledge for Small-footprint Speaker Verification

Xuechen Liu

Md. Sahidullah

Tomi Kinnunen

274

02 Mar 2023

MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation

Interspeech (Interspeech), 2023

227

01 Mar 2023

CAM++: A Fast and Efficient Network for Speaker Verification Using Context-Aware Masking

Interspeech (Interspeech), 2023

Haibo Wang

Siqi Zheng

Yafeng Chen

Luyao Cheng

Qian Chen

160

157

01 Mar 2023

Distance-based Weight Transfer from Near-field to Far-field Speaker Verification

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

203

01 Mar 2023

PCF: ECAPA-TDNN with Progressive Channel Fusion for Speaker Verification

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Z. Zhao

Zhuo Li

Wenchao Wang

Pengyuan Zhang

118

01 Mar 2023

Practice of the conformer enhanced AUDIO-VISUAL HUBERT on Mandarin and English

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

130

28 Feb 2023

Ensemble knowledge distillation of self-supervised speech models

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Kuan-Po Huang

278

24 Feb 2023

Cross-modal Audio-visual Co-learning for Text-independent Speaker Verification

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Meng Liu

Kong Aik Lee

Longbiao Wang

Hanyi Zhang

Chang Zeng

Jianwu Dang

171

22 Feb 2023

Advancing Stuttering Detection via Data Augmentation, Class-Balanced Loss and Multi-Contextual Deep Learning

IEEE Journal of Biomedical and Health Informatics (IEEE JBHI), 2023

194

21 Feb 2023

VoxSRC 2022: The Fourth VoxCeleb Speaker Recognition Challenge

233

20 Feb 2023

RobustDistiller: Compressing Universal Speech Representations for Enhanced Environment Robustness

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

362

18 Feb 2023

Improving Transformer-based Networks With Locality For Automatic Speaker Verification

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

216

17 Feb 2023

Cross-Corpora Spoken Language Identification with Domain Diversification and Generalization

Computer Speech and Language (CSL), 2023

Spandan Dey

Md. Sahidullah

G. Saha

142

10 Feb 2023

MooseNet: A Trainable Metric for Synthesized Speech with a PLDA Module

Speech Synthesis Workshop (SSW), 2023

Ondřej Plátek

Ondrej Dusek

188

17 Jan 2023