MUSAN: A Music, Speech, and Noise Corpus

28 October 2015

Papers citing "MUSAN: A Music, Speech, and Noise Corpus"

50 / 664 papers shown

GraFPrint: A GNN-Based Approach for Audio Identification

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

Aditya Bhattacharjee

Shubhr Singh

Emmanouil Benetos

253

14 Oct 2024

The First VoicePrivacy Attacker Challenge Evaluation Plan

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

434

09 Oct 2024

Mamba-based Segmentation Model for Speaker Diarization

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

Marc Delcroix

Shoko Araki

236

09 Oct 2024

LS-EEND: Long-Form Streaming End-to-End Neural Diarization with Online Attractor Extraction

IEEE Transactions on Audio, Speech, and Language Processing (TASLP), 2024

Di Liang

Xiaofei Li

341

09 Oct 2024

Improving Speaker Representations Using Contrastive Losses on Multi-scale Features

Satvik Dixit

Massa Baali

Rita Singh

Bhiksha Raj

318

07 Oct 2024

Analyzing and Mitigating Inconsistency in Discrete Audio Tokens for Neural Codec Language Models

Jin Xu

213

28 Sep 2024

Incorporating Spatial Cues in Modular Speaker Diarization for Multi-channel Multi-party Meetings

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

Jia Pan

298

25 Sep 2024

MT2KD: Towards A General-Purpose Encoder for Speech, Speaker, and Audio Events

IEEE Transactions on Audio, Speech, and Language Processing (TASLP), 2024

471

25 Sep 2024

Disentangling Age and Identity with a Mutual Information Minimization Approach for Cross-Age Speaker Verification

Fengrun Zhang

Wang Geng

213

24 Sep 2024

WeSep: A Scalable and Flexible Toolkit Towards Generalizable Target Speaker Extraction

Interspeech, 2024

Shuai Wang

Ke Zhang

Shaoxiong Lin

Junjie Li

Xuefei Wang

Meng Ge

Jianwei Yu

Yanmin Qian

Haizhou Li

190

24 Sep 2024

M-Vec: Matryoshka Speaker Embeddings with Flexible Dimensions

Shuai Wang

Pengcheng Zhu

Haizhou Li

177

24 Sep 2024

CA-MHFA: A Context-Aware Multi-Head Factorized Attentive Pooling for SSL-Based Speaker Verification

Lukáš Burget

Jan Černocký

165

23 Sep 2024

Learning Source Disentanglement in Neural Audio Codec

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

Xiaoyu Bie

Xubo Liu

Gaël Richard

233

17 Sep 2024

Speaker-IPL: Unsupervised Learning of Speaker Characteristics with i-Vector based Pseudo-Labels

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

Ahmed Hussen Abdelaziz

Shinji Watanabe

Tatiana Likhomanenko

B. Theobald

VLM SSL

235

16 Sep 2024

Optimizing Dysarthria Wake-Up Word Spotting: An End-to-End Approach for SLT 2024 LRDWWS Challenge

Spoken Language Technology Workshop (SLT), 2024

Yujun Wang

Lei Xie

295

16 Sep 2024

Speaker Contrastive Learning for Source Speaker Tracing

Spoken Language Technology Workshop (SLT), 2024

Xiao-Lei Zhang

288

16 Sep 2024

On the effectiveness of enrollment speech augmentation for Target Speaker Extraction

Spoken Language Technology Workshop (SLT), 2024

Junjie Li

Ke Zhang

Shuai Wang

Haizhou Li

Man-Wai Mak

Kong Aik Lee

143

15 Sep 2024

Multi-modal Speech Transformer Decoders: When Do Multiple Modalities Improve Accuracy?

Yiwen Guan

V. Trinh

Vivek Voleti

Jacob Whitehill

279

13 Sep 2024

Early Joint Learning of Emotion Information Makes MultiModal Model Understand You Better

Tao Zhang

226

12 Sep 2024

Spoofing-Aware Speaker Verification Robust Against Domain and Channel Mismatches

Spoken Language Technology Workshop (SLT), 2024

Xin Wang

189

10 Sep 2024

Findings of the 2024 Mandarin Stuttering Event Detection and Automatic Speech Recognition Challenge

Spoken Language Technology Workshop (SLT), 2024

Hongfei Xue

Rong Gong

Mingchen Shao

Xin Xu

L. xilinx Wang

...

Yong Qin

Jun Du

Ming Li

Binbin Zhang

Bin Jia

182

09 Sep 2024

The USTC-NERCSLIP Systems for the CHiME-8 NOTSOFAR-1 Challenge

...

Jia Pan

Jianqing Gao

309

03 Sep 2024

USTC-KXDIGIT System Description for ASVspoof5 Challenge

...

Lin Liu

214

03 Sep 2024

Resource-Efficient Adaptation of Speech Foundation Models for Multi-Speaker ASR

Spoken Language Technology Workshop (SLT), 2024

Weiqing Wang

Kunal Dhawan

Taejin Park

Jagadeesh Balam

Boris Ginsburg

226

02 Sep 2024

The VoxCeleb Speaker Recognition Challenge: A Retrospective

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2024

273

27 Aug 2024

A Preliminary Case Study on Long-Form In-the-Wild Audio Spoofing Detection

Biometrics and Electronic Signatures (BES), 2024

Xuechen Liu

Xin Wang

Junichi Yamagishi

169

26 Aug 2024

NEST: Self-supervised Fast Conformer as All-purpose Seasoning to Speech Processing Tasks

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

He Huang

Taejin Park

Kunal Dhawan

Jagadeesh Balam

Boris Ginsburg

SSL AI4TS

320

23 Aug 2024

BUT Systems and Analyses for the ASVspoof 5 Challenge

Johan Rohdin

Lin Zhang

Oldřich Plchot

Vojtěch Staněk

...

Lukáš Burget

180

20 Aug 2024

Temporal Variability and Multi-Viewed Self-Supervised Representations to Tackle the ASVspoof5 Deepfake Challenge

Yuankun Xie

Xiaopeng Wang

193

13 Aug 2024

ADD 2023: Towards Audio Deepfake Detection and Analysis in the Wild

Junzuo Zhou

266

09 Aug 2024

Language Model Can Listen While Speaking

AAAI Conference on Artificial Intelligence (AAAI), 2024

Yakun Song

Zhuo Chen

259

05 Aug 2024

Whisper-SV: Adapting Whisper for Low-data-resource Speaker Verification

Lei Xie

229

14 Jul 2024

A Benchmark for Multi-speaker Anonymization

Xiaoxiao Miao

Ruijie Tao

Chang Zeng

Xin Wang

302

08 Jul 2024

WildDESED: An LLM-Powered Dataset for Wild Domestic Environment Sound Event Detection System

Yang Xiao

Rohan Kumar Das

222

04 Jul 2024

Learning Video Temporal Dynamics with Cross-Modal Attention for Robust Audio-Visual Speech Recognition

Sungnyun Kim

Kangwook Jang

Sangmin Bae

Hoirin Kim

Se-Young Yun

238

04 Jul 2024

GMM-ResNext: Combining Generative and Discriminative Models for Speaker Verification

166

03 Jul 2024

Leveraging Speaker Embeddings in End-to-End Neural Diarization for Two-Speaker Scenarios

Juan Ignacio Alvarez-Trejos

Beltrán Labrador

Alicia Lozano-Diez

350

01 Jul 2024

Are you sure? Analysing Uncertainty Quantification Approaches for Real-world Speech Emotion Recognition

Björn Schuller

190

01 Jul 2024

FMSG-JLESS Submission for DCASE 2024 Task4 on Sound Event Detection with Heterogeneous Training Dataset and Potentially Missing Labels

215

29 Jun 2024

Speakers Unembedded: Embedding-free Approach to Long-form Neural Diarization

152

26 Jun 2024

A Comprehensive Solution to Connect Speech Encoder and Large Language Model for ASR

Van Tung Pham

Yist Y. Lin

Tao Han

Wei Li

Jun Zhang

Lu Lu

Yuxuan Wang

AuLLM

163

25 Jun 2024

Disentangled Representation Learning for Environment-agnostic Speaker Recognition

231

20 Jun 2024

CEC: A Noisy Label Detection Method for Speaker Recognition

Interspeech, 2024

133

19 Jun 2024

Self-Distillation Prototypes Network: Learning Robust Speaker Representations without Supervision

Yafeng Chen

Siqi Zheng

Hui Wang

Luyao Cheng

Qian Chen

Shiliang Zhang

Wen Wang

SSL

140

17 Jun 2024

Robust Channel Learning for Large-Scale Radio Speaker Verification

208

16 Jun 2024

Double Multi-Head Attention Multimodal System for Odyssey 2024 Speech Emotion Recognition Challenge

The Speaker and Language Recognition Workshop (Odyssey), 2024

Federico Costa

Miquel India

Javier Hernando

232

15 Jun 2024

SOA: Reducing Domain Mismatch in SSL Pipeline by Speech Only Adaptation for Low Resource ASR

Natarajan Balaji Shankar

Ruchao Fan

Abeer Alwan

245

15 Jun 2024

Exploring Spoken Language Identification Strategies for Automatic Transcription of Multilingual Broadcast and Institutional Speech

173

13 Jun 2024

DCASE 2024 Task 4: Sound Event Detection with Heterogeneous Data and Missing Labels

Romain Serizel

192

12 Jun 2024

Comparative Analysis of Personalized Voice Activity Detection Systems: Assessing Real-World Effectiveness

Rudovic

Ahmed Hussen Abdelaziz

Saurabh N. Adya

200

12 Jun 2024