MUSAN: A Music, Speech, and Noise Corpus

28 October 2015

Papers citing "MUSAN: A Music, Speech, and Noise Corpus"

NIST SRE CTS Superset: A large-scale dataset for telephony speaker recognition

S. O. Sadjadi

AI4TS

16 Aug 2021

Xi-Vector Embedding for Speaker Recognition

IEEE Signal Processing Letters (IEEE SPL), 2021

Kong Aik Lee

Qiongqiong Wang

Takafumi Koshinaka

BDL

12 Aug 2021

Multi-channel Speech Enhancement with 2-D Convolutional Time-frequency Domain Features and a Pre-trained Acoustic Model

Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2021

Liyong Guo

Yujun Wang

234

23 Jul 2021

Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection

ACM Multimedia (ACM MM), 2021

Ruijie Tao

Zexu Pan

Rohan Kumar Das

Xinyuan Qian

Mike Zheng Shou

Haizhou Li

205

218

14 Jul 2021

DPCRN: Dual-Path Convolution Recurrent Network for Single Channel Speech Enhancement

212

101

12 Jul 2021

MACCIF-TDNN: Multi aspect aggregation of channel and context interdependence features in TDNN-based speaker verification

Fangyuan Wang

Z. Song

Hongchen Jiang

Bo Xu

102

07 Jul 2021

The HCCL Speaker Verification System for Far-Field Speaker Verification Challenge

Zhuo Li

125

03 Jul 2021

An Integrated Framework for Two-pass Personalized Voice Trigger

Interspeech (Interspeech), 2021

180

30 Jun 2021

A Simultaneous Denoising and Dereverberation Framework with Target Decoupling

175

24 Jun 2021

Multi-Level Transfer Learning from Near-Field to Far-Field Speaker Verification

Li Zhang

Qing Wang

Kong Aik Lee

Lei Xie

Haizhou Li

165

17 Jun 2021

End-to-end Neural Diarization: From Transformer to Conformer

Interspeech (Interspeech), 2021

208

14 Jun 2021

Noise Classification Aided Attention-Based Neural Network for Monaural Speech Enhancement

119

31 May 2021

DIVE: End-to-end Speech Diarization via Iterative Speaker Embedding

Automatic Speech Recognition & Understanding (ASRU), 2021

Neil Zeghidour

O. Teboul

David Grangier

125

28 May 2021

Cross-Referencing Self-Training Network for Sound Event Detection in Audio Mixtures

IEEE Transactions on Multimedia (IEEE Trans. Multimedia), 2021

Sangwook Park

D. Han

Mounya Elhilali

180

27 May 2021

Advances in integration of end-to-end neural and clustering-based diarization for real conversational speech

Interspeech (Interspeech), 2021

K. Kinoshita

Marc Delcroix

Naohiro Tawara

252

19 May 2021

X-Vectors with Multi-Scale Aggregation for Speaker Diarization

Myung-Jae Kim

V. Apsingekar

Divya Neelagiri

119

16 May 2021

Study on the temporal pooling used in deep neural networks for speaker verification

European Signal Processing Conference (EUSIPCO), 2021

Mickael Rouvier

Pierre-Michel Bousquet

J. Duret

136

10 May 2021

Voice activity detection in the wild: A data-driven approach using teacher-student training

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2021

Shuai Wang

117

10 May 2021

Test-Time Adaptation Toward Personalized Speech Enhancement: Zero-Shot Learning with Knowledge Distillation

IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2021

Sunwoo Kim

Minje Kim

174

08 May 2021

Zero-Shot Personalized Speech Enhancement through Speaker-Informed Model Selection

IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2021

Aswin Sivaraman

Minje Kim

157

08 May 2021

Multimodal Self-Supervised Learning of General Audio Representations

Luyu Wang

Pauline Luc

Adrià Recasens

Jean-Baptiste Alayrac

Aaron van den Oord

SSL

252

26 Apr 2021

Fusing information streams in end-to-end audio-visual speech recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021

Wentao Yu

Steffen Zeiler

D. Kolossa

242

19 Apr 2021

Learning Metrics from Mean Teacher: A Supervised Learning Method for Improving the Generalization of Speaker Verification System

Ju-ho Kim

Hye-jin Shim

Jee-weon Jung

Ha-Jin Yu

182

14 Apr 2021

End-to-end speaker segmentation for overlap-aware resegmentation

Interspeech (Interspeech), 2021

H. Bredin

Antoine Laurent

VLM

619

197

08 Apr 2021

Utilizing Self-supervised Representations for MOS Prediction

Interspeech (Interspeech), 2021

404

07 Apr 2021

Three-class Overlapped Speech Detection using a Convolutional Recurrent Neural Network

Interspeech (Interspeech), 2021

273

07 Apr 2021

Personalized Speech Enhancement through Self-Supervised Data Augmentation and Purification

Interspeech (Interspeech), 2021

Aswin Sivaraman

Sunwoo Kim

Minje Kim

232

05 Apr 2021

Efficient Personalized Speech Enhancement through Self-Supervised Learning

IEEE Journal on Selected Topics in Signal Processing (JSTSP), 2021

Aswin Sivaraman

Minje Kim

226

05 Apr 2021

Attention Back-end for Automatic Speaker Verification with Multiple Enrollment Utterances

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021

Chang Zeng

Xin Wang

Erica Cooper

Xiaoxiao Miao

Junichi Yamagishi

168

04 Apr 2021

INTERSPEECH 2021 ConferencingSpeech Challenge: Towards Far-field Multi-Channel Speech Enhancement for Video Conferencing

Xin Xu

...

Hui Bu

149

02 Apr 2021

Multilingual and code-switching ASR challenges for low resource Indian languages

Interspeech (Interspeech), 2021

...

Karthik Sankaranarayanan

Tejaswi Seeram

Basil Abraham

138

109

01 Apr 2021

Auto-KWS 2021 Challenge: Task, Datasets, and Baselines

Interspeech (Interspeech), 2021

Qijie Shao

Lei Xie

118

31 Mar 2021

Quantifying Bias in Automatic Speech Recognition

185

28 Mar 2021

EfficientTDNN: Efficient Architecture Search for Speaker Recognition

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2021

Rui Wang

263

25 Mar 2021

USTC-NELSLIP System Description for DIHARD-III Challenge

Shutong Niu

Tian Gao

Jia Pan

143

19 Mar 2021

Learning spectro-temporal representations of complex sounds with parameterized neural networks

Journal of the Acoustical Society of America (JASA), 2021

Rachid Riad

Julien Karadayi

Anne-Catherine Bachoud-Lévi

Emmanuel Dupoux

141

12 Mar 2021

An Ultra-low Power RNN Classifier for Always-On Voice Wake-Up Detection Robust to Real-World Scenarios

E. Hardy

F. Badets

08 Mar 2021

The NPU System for the 2020 Personalized Voice Trigger Challenge

Qijie Shao

Lei Xie

123

26 Feb 2021

Artificially Synthesising Data for Audio Classification and Segmentation to Improve Speech and Music Detection in Radio Broadcast

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021

...

19 Feb 2021

AISPEECH-SJTU accent identification system for the Accented English Speech Recognition Challenge

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021

Rao Ma

164

19 Feb 2021

An Investigation of End-to-End Models for Robust Speech Recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021

Archiki Prasad

Preethi Jyothi

R. Velmurugan

150

11 Feb 2021

The DKU-Duke-Lenovo System Description for the Third DIHARD Speech Diarization Challenge

170

06 Feb 2021

The Hitachi-JHU DIHARD III System: Competitive End-to-End Neural Diarization and X-Vector Clustering Systems Combined by DOVER-Lap

Shota Horiguchi

Nelson Yalta

Leibny Paola García-Perera

Sanjeev Khudanpur

117

02 Feb 2021

Online Streaming End-to-End Neural Diarization Handling Overlapping Speech and Flexible Numbers of Speakers

Leibny Paola García-Perera

Kenji Nagamatsu

190

21 Jan 2021

A Principle Solution for Enroll-Test Mismatch in Speaker Recognition

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2020

Dong Wang

143

23 Dec 2020

CN-Celeb: multi-genre speaker recognition

Speech Communication (Speech Commun.), 2020

Hao Cui

Dong Wang

220

143

23 Dec 2020

End-to-End Speaker Diarization as Post-Processing

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020

Shota Horiguchi

Leibny Paola García-Perera

Yusuke Fujita

Shinji Watanabe

Kenji Nagamatsu

231

18 Dec 2020

VoxSRC 2020: The Second VoxCeleb Speaker Recognition Challenge

Joon Son Chung

Andrew Brown

162

12 Dec 2020

One Shot Learning for Speech Separation

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020

Kuan-Po Huang

163

20 Nov 2020

Towards Semi-Supervised Semantics Understanding from Speech

185

11 Nov 2020