MUSAN: A Music, Speech, and Noise Corpus

28 October 2015

Papers citing "MUSAN: A Music, Speech, and Noise Corpus"

50 / 664 papers shown

Fast Entropy-Based Methods of Word-Level Confidence Estimation for End-To-End Automatic Speech Recognition

Spoken Language Technology Workshop (SLT), 2022

A. Laptev

Boris Ginsburg

206

16 Dec 2022

Leveraging Modality-specific Representations for Audio-visual Speech Recognition via Reinforcement Learning

AAAI Conference on Artificial Intelligence (AAAI), 2022

Chen Chen

Yuchen Hu

Qiang Zhang

Heqing Zou

Beier Zhu

Eng Siong Chng

263

10 Dec 2022

GPU-accelerated Guided Source Separation for Meeting Transcription

Interspeech (Interspeech), 2022

Desh Raj

Daniel Povey

Sanjeev Khudanpur

321

10 Dec 2022

Covariance Regularization for Probabilistic Linear Discriminant Analysis

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

123

06 Dec 2022

Self-Supervised Audio-Visual Speech Representations Learning By Multimodal Self-Distillation

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Jianqing Gao

225

06 Dec 2022

A General Unfolding Speech Enhancement Method Motivated by Taylor's Theorem

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022

275

30 Nov 2022

MSV Challenge 2022: NPU-HC Speaker Verification System for Low-resource Indian Languages

Jie Liu

151

30 Nov 2022

TaylorBeamixer: Learning Taylor-Inspired All-Neural Multi-Channel Speech Enhancement from Beam-Space Dictionary Perspective

Interspeech (Interspeech), 2022

224

22 Nov 2022

VATLM: Visual-Audio-Text Pre-Training with Unified Masked Prediction for Speech Representation Learning

IEEE Transactions on Multimedia (IEEE TMM), 2022

265

21 Nov 2022

Simultaneously Learning Robust Audio Embeddings and balanced Hash codes for Query-by-Example

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Anup Singh

Kris Demuynck

Vipul Arora

100

20 Nov 2022

Speaker Overlap-aware Neural Diarization for Multi-party Meeting Analysis

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

Zhihao Du

Shiliang Zhang

Siqi Zheng

Zhijie Yan

101

18 Nov 2022

Multi-source Domain Adaptation for Text-independent Forensic Speaker Recognition

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022

Zhenyu Wang

John H. L. Hansen

181

17 Nov 2022

Speaker Adaptation for End-To-End Speech Recognition Systems in Noisy Environments

Automatic Speech Recognition & Understanding (ASRU), 2022

248

16 Nov 2022

The Potential of Neural Speech Synthesis-based Data Augmentation for Personalized Speech Enhancement

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Anastasia Kuznetsova

Aswin Sivaraman

Minje Kim

209

14 Nov 2022

Towards A Unified Conformer Structure: from ASR to ASV Task

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

189

14 Nov 2022

Multi-Speaker and Wide-Band Simulated Conversations as Training Data for End-to-End Neural Diarization

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

200

12 Nov 2022

Improving the Robustness of DistilHuBERT to Unseen Noisy Conditions via Data Augmentation, Curriculum Learning, and Multi-Task Enhancement

286

12 Nov 2022

Low Pass Filtering and Bandwidth Extension for Robust Anti-spoofing Countermeasure Against Codec Variabilities

International Symposium on Chinese Spoken Language Processing (ISCSLP), 2022

140

12 Nov 2022

Handling Trade-Offs in Speech Separation with Sparsely-Gated Mixture of Experts

175

11 Nov 2022

High-resolution embedding extractor for speaker diarisation

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

210

08 Nov 2022

Late Audio-Visual Fusion for In-The-Wild Speaker Diarization

Zexu Pan

Gordon Wichern

François Germain

Aswin Shanmugam Subramanian

Jonathan Le Roux

VGen

317

02 Nov 2022

data2vec-aqc: Search for the right Teaching Assistant in the Teacher-Student training setup

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

158

02 Nov 2022

I4U System Description for NIST SRE'20 CTS Challenge

Kong Aik Lee

...

Haizhou Li

Alfonso Ortega Giménez

Longbiao Wang

L. Buera

02 Nov 2022

LMD: A Learnable Mask Network to Detect Adversarial Examples for Speaker Verification

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022

278

02 Nov 2022

Build a SRE Challenge System: Lessons from VoxSRC 2022 and CNSRC 2022

Interspeech (Interspeech), 2022

219

02 Nov 2022

Metric Learning for User-defined Keyword Spotting

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

232

01 Nov 2022

Waveform Boundary Detection for Partially Spoofed Audio

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Zexin Cai

Weiqing Wang

Ming Li

01 Nov 2022

Model Compression for DNN-based Speaker Verification Using Weight Quantization

Interspeech (Interspeech), 2022

385

31 Oct 2022

Convolution-Based Channel-Frequency Attention for Text-Independent Speaker Verification

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Jingyu Li

Yusheng Tian

Tan Lee

113

31 Oct 2022

Fast and parallel decoding for transducer

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Wei Kang

Liyong Guo

Fangjun Kuang

Long Lin

Mingshuang Luo

Zengwei Yao

Xiaoyu Yang

Piotr Żelasko

Daniel Povey

AI4TS

257

31 Oct 2022

Delay-penalized transducer for low-latency streaming ASR

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Wei Kang

Zengwei Yao

Fangjun Kuang

Liyong Guo

Xiaoyu Yang

Long Lin

Piotr Żelasko

Daniel Povey

253

31 Oct 2022

Predicting Multi-Codebook Vector Quantization Indexes for Knowledge Distillation

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Liyong Guo

Xiaoyu Yang

Quandong Wang

Yuxiang Kong

Zengwei Yao

...

Wei Kang

Long Lin

188

31 Oct 2022

Wespeaker: A Research and Production oriented Speaker Embedding Learning Toolkit

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Shuai Wang

Binbin Zhang

282

194

31 Oct 2022

SRTNet: Time Domain Speech Enhancement Via Stochastic Refinement

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

264

30 Oct 2022

Adaptive Speech Quality Aware Complex Neural Network for Acoustic Echo Cancellation with Supervised Contrastive Learning

Bozhong Liu

Xiaoxi Yu

Hantao Huang

254

30 Oct 2022

Speaker Representation Learning via Contrastive Loss with Maximal Speaker Separability

Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2022

Zhe Li

Man-Wai Mak

SSL

263

29 Oct 2022

Target-Speaker Voice Activity Detection via Sequence-to-Sequence Prediction

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

336

28 Oct 2022

A comprehensive study on self-supervised distillation for speaker representation learning

Spoken Language Technology Workshop (SLT), 2022

345

28 Oct 2022

Speaker recognition with two-step multi-modal deep cleansing

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Ruijie Tao

Kong Aik Lee

Zhan Shi

Haizhou Li

NoLa

139

28 Oct 2022

Self-Supervised Training of Speaker Encoder with Multi-Modal Diverse Positive Pairs

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022

Ruijie Tao

Kong Aik Lee

Rohan Kumar Das

Ville Hautamaki

Haizhou Li

SSL

216

27 Oct 2022

Robust Data2vec: Noise-robust Speech Representation Learning for ASR by Combining Regression and Improved Contrastive Learning

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Yu-Chen Hu

179

27 Oct 2022

TSUP Speaker Diarization System for Conversational Short-phrase Speaker Diarization Challenge

International Symposium on Chinese Spoken Language Processing (ISCSLP), 2022

Xiaoyue Yang

155

26 Oct 2022

Speaker Diarization Based on Multi-channel Microphone Array in Small-scale Meeting

Yu Du

R. Zhou

26 Oct 2022

Improving Speech-to-Speech Translation Through Unlabeled Text

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

196

26 Oct 2022

Large-scale learning of generalised representations for speaker recognition

Hye-jin Shim

206

20 Oct 2022

How to Leverage DNN-based speech enhancement for multi-channel speaker verification?

Sandipana Dowerah

Romain Serizel

D. Jouvet

Mohammad MohammadAmini

D. Matrouf

150

17 Oct 2022

Spatial-DCCRN: DCCRN Equipped with Frame-Level Angle Feature and Hybrid Filtering for Multi-Channel Speech Enhancement

Spoken Language Technology Workshop (SLT), 2022

154

17 Oct 2022

Attention-Based Audio Embeddings for Query-by-Example

International Society for Music Information Retrieval Conference (ISMIR), 2022

Anup Singh

Kris Demuynck

Vipul Arora

107

16 Oct 2022

Improving generalizability of distilled self-supervised speech processing models under distorted settings

Spoken Language Technology Workshop (SLT), 2022

Kuan-Po Huang

Yu-Kuan Fu

Tsung-Yuan Hsu

Fabian Ritter-Gutierrez

247

14 Oct 2022

Description and analysis of novelties introduced in DCASE Task 4 2022 on the baseline system

Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE), 2022

161

14 Oct 2022