MUSAN: A Music, Speech, and Noise Corpus

28 October 2015

Papers citing "MUSAN: A Music, Speech, and Noise Corpus"

50 / 664 papers shown

Yet Another Model for Arabic Dialect Identification

Ajinkya Kulkarni

Hanan Aldarmaki

151

20 Oct 2023

The CHiME-7 Challenge: System Description and Performance of NeMo Team's DASR System

T. Park

He Huang

Ante Jukić

Kunal Dhawan

Krishna C. Puvvada

Nithin Rao Koluguri

Nikolay Karpov

A. Laptev

Jagadeesh Balam

Boris Ginsburg

202

18 Oct 2023

End-to-end Online Speaker Diarization with Target Speaker Tracking

Weiqing Wang

Ming Li

315

12 Oct 2023

LRPD: Large Replay Parallel Dataset

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

265

29 Sep 2023

Low-Resource Self-Supervised Learning with SSL-Enhanced TTS

Yossi Adi

173

29 Sep 2023

Audio-Visual Speaker Verification via Joint Cross-Attention

International Conference on Speech and Computer (SPECOM), 2023

R Gnana Praveen

Jahangir Alam

277

28 Sep 2023

Meeting Recognition with Continuous Speech Separation and Transcription-Supported Diarization

285

28 Sep 2023

DualVC 2: Dynamic Masked Convolution for Unified Streaming and Non-Streaming Voice Conversion

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Ziqian Ning

Yuepeng Jiang

Pengcheng Zhu

Shuai Wang

Jixun Yao

Linfu Xie

Mengxiao Bi

294

27 Sep 2023

Collaborative Watermarking for Adversarial Speech Synthesis

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Lauri Juvela

Xin Wang

230

26 Sep 2023

Emphasized Non-Target Speaker Knowledge in Knowledge Distillation for Automatic Speaker Verification

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Kong Aik Lee

187

26 Sep 2023

Rethinking Session Variability: Leveraging Session Embeddings for Session Robustness in Speaker Verification

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

263

26 Sep 2023

Multi-Domain Adaptation by Self-Supervised Learning for Speaker Verification

Wan Lin

Lantian Li

D. Wang

116

25 Sep 2023

Frame-wise streaming end-to-end speaker diarization with non-autoregressive self-attention-based attractors

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Di Liang

Nian Shao

Xiaofei Li

194

25 Sep 2023

Contrastive Speaker Embedding With Sequential Disentanglement

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

160

23 Sep 2023

Reduce, Reuse, Recycle: Is Perturbed Data better than Other Language augmentation for Low Resource Self-Supervised Speech Models

Interspeech (Interspeech), 2023

Asad Ullah

Alessandro Ragano

Andrew Hines

428

22 Sep 2023

NTT speaker diarization system for CHiME-7: multi-domain, multi-microphone End-to-end and vector clustering diarization

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Naohiro Tawara

Marc Delcroix

Atsushi Ando

A. Ogawa

208

22 Sep 2023

A Multiscale Autoencoder (MSAE) Framework for End-to-End Neural Network Speech Enhancement

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023

Bengt J. Borgström

M. Brandstein

173

21 Sep 2023

The Impact of Silence on Speech Anti-Spoofing

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023

Zhuo Li

Pengyuan Zhang

193

21 Sep 2023

Refining DNN-based Mask Estimation using CGMM-based EM Algorithm for Multi-channel Noise Reduction

Interspeech (Interspeech), 2022

Julitta Bartolewska

Stanisław Kacprzak

K. Kowalczyk

122

18 Sep 2023

Diff-SV: A Unified Hierarchical Framework for Noise-Robust Speaker Verification Using Score-Based Diffusion Probabilistic Models

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

160

14 Sep 2023

PromptASR for contextualized ASR with controllable style

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Xiaoyu Yang

Wei Kang

Zengwei Yao

Yifan Yang

Liyong Guo

Fangjun Kuang

Long Lin

Daniel Povey

344

14 Sep 2023

SynVox2: Towards a privacy-friendly VoxCeleb2 dataset

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

243

12 Sep 2023

LeBenchmark 2.0: a Standardized, Replicable and Enhanced Framework for Self-supervised Representations of French Speech

Computer Speech and Language (CSL), 2023

...

262

11 Sep 2023

Hierarchical Audio-Visual Information Fusion with Multi-label Joint Decoding for MER 2023

ACM Multimedia (ACM MM), 2023

...

213

11 Sep 2023

ReZero: Region-customizable Sound Extraction

Rongzhi Gu

Yi Luo

147

31 Aug 2023

The USTC-NERCSLIP Systems for the CHiME-7 DASR Challenge

Ruoyu Wang

Maokui He

Jun Du

Hengshun Zhou

Shutong Niu

...

Mengzhi Wang

Genshun Wan

Jia Pan

Jianqing Gao

Chin-Hui Lee

235

28 Aug 2023

UNISOUND System for VoxCeleb Speaker Recognition Challenge 2023

140

24 Aug 2023

AdVerb: Visually Guided Audio Dereverberation

IEEE International Conference on Computer Vision (ICCV), 2023

212

23 Aug 2023

Convoifilter: A case study of doing cocktail party speech recognition

Thai-Binh Nguyen

A. Waibel

243

22 Aug 2023

The DKU-DUKEECE System for the Manipulation Region Location Task of ADD 2023

144

20 Aug 2023

Graph Neural Network Backend for Speaker Recognition

Liang He

Rui Li

Mengqi Niu

168

17 Aug 2023

The DKU-MSXF Speaker Verification System for the VoxCeleb Speaker Recognition Challenge 2023

172

17 Aug 2023

ChinaTelecom System Description to VoxCeleb Speaker Recognition Challenge 2023

Mengjie Du

Xiang Fang

Jie Li

167

16 Aug 2023

SpeechX: Neural Codec Language Model as a Versatile Speech Transformer

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023

315

112

14 Aug 2023

Large-Scale Learning on Overlapped Speech Detection: New Benchmark and New General System

210

11 Aug 2023

Joint speech and overlap detection: a benchmark over multiple audio setup and speech domains

102

24 Jul 2023

Robust Automatic Speech Recognition via WavAugment Guided Phoneme Adversarial Training

Interspeech (Interspeech), 2023

219

24 Jul 2023

PAS: Partial Additive Speech Data Augmentation Method for Noise Robust Speaker Verification

178

20 Jul 2023

Exploring Binary Classification Loss For Speaker Verification

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

168

17 Jul 2023

Representation Learning With Hidden Unit Clustering For Low Resource Speech Applications

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023

Varun Krishna

T. Sai

Sriram Ganapathy

SSL

161

14 Jul 2023

Self-supervised learning with diffusion-based multichannel speech enhancement for speaker verification under noisy conditions

Interspeech (Interspeech), 2023

236

05 Jul 2023

Pretraining Conformer with ASR or ASV for Anti-Spoofing Countermeasure

Yikang Wang

Hiromitsu Nishizaki

Ming Li

215

04 Jul 2023

An End-to-End Multi-Module Audio Deepfake Generation System for ADD Challenge 2023

148

03 Jul 2023

VoxWatch: An open-set speaker recognition benchmark on VoxCeleb

Raghuveer Peri

S. O. Sadjadi

D. Garcia-Romero

159

30 Jun 2023

The CHiME-7 DASR Challenge: Distant Meeting Transcription with Multiple Devices in Diverse Scenarios

...

Sanjeev Khudanpur

239

23 Jun 2023

MIR-GAN: Refining Frame-Level Modality-Invariant Representations with Adversarial Network for Audio-Visual Speech Recognition

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Yuchen Hu

Chen Chen

208

18 Jun 2023

Hearing Lips in Noise: Universal Viseme-Phoneme Mapping and Transfer for Robust Audio-Visual Speech Recognition

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Yuchen Hu

223

18 Jun 2023

SURT 2.0: Advances in Transducer-based Multi-talker Speech Recognition

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023

Desh Raj

Daniel Povey

Sanjeev Khudanpur

VLM

337

18 Jun 2023

CoverHunter: Cover Song Identification with Refined Attention and Alignments

IEEE International Conference on Multimedia and Expo (ICME), 2023

167

15 Jun 2023

Speaker Verification Across Ages: Investigating Deep Speaker Embedding Sensitivity to Age Mismatch in Enrollment and Test Speech

Interspeech (Interspeech), 2023

Vishwanath Pratap Singh

Md. Sahidullah

Tomi Kinnunen

197

13 Jun 2023