MUSAN: A Music, Speech, and Noise Corpus


28 October 2015
David Snyder
Guoguo Chen
Daniel Povey

Papers citing "MUSAN: A Music, Speech, and Noise Corpus"

50 / 664 papers shown
A Comprehensive Investigation on Speaker Augmentation for Speaker Recognition
Zhenyu Zhou
Shibiao Xu
Shi Yin
Lantian Li
D. Wang
137
5
0
11 Jun 2024
MR-RawNet: Speaker verification system with multiple temporal resolutions for variable duration utterances using raw waveforms
Seung-bin Kim
Chan-yeong Lim
Jungwoo Heo
Ju-ho Kim
Hyun-Seo Shin
Kyo-Won Koo
Ha-Jin Yu
273
3
0
11 Jun 2024
MaLa-ASR: Multimedia-Assisted LLM-Based ASR
Guanrou Yang
Ziyang Ma
Fan Yu
Zhifu Gao
Shiliang Zhang
Xie Chen
AuLLM
315
5
0
09 Jun 2024
An Investigation of Noise Robustness for Flow-Matching-Based Zero-Shot TTS
Interspeech (Interspeech), 2024
Xiaofei Wang
Sefik Emre Eskimez
Manthan Thakker
Hemin Yang
Zirun Zhu
...
Yufei Xia
Jinzhu Li
Sheng Zhao
Jinyu Li
Naoyuki Kanda
142
6
0
09 Jun 2024
DAISY: Data Adaptive Self-Supervised Early Exit for Speech Representation Models
Interspeech (Interspeech), 2024
Tzu-Quan Lin
Hung-yi Lee
Hao Tang
273
4
0
08 Jun 2024
Towards Lightweight Speaker Verification via Adaptive Neural Network Quantization
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2024
Bei Liu
Haoyu Wang
Yanmin Qian
MQ
384
3
0
08 Jun 2024
To what extent can ASV systems naturally defend against spoofing attacks?
Interspeech (Interspeech), 2024
Jee-weon Jung
Xin Eric Wang
Nicholas W. D. Evans
Shinji Watanabe
Hye-jin Shim
Hemlata Tak
Siddhant Arora
Junichi Yamagishi
Joon Son Chung
AAML
203
10
0
08 Jun 2024
Relational Proxy Loss for Audio-Text based Keyword Spotting
Interspeech (Interspeech), 2024
Youngmoon Jung
Seungjin Lee
Joon-Young Yang
Jaeyoung Roh
Chang Woo Han
Hoon-Young Cho
156
3
0
08 Jun 2024
Generalized Source Tracing: Detecting Novel Audio Deepfake Algorithm with Real Emphasis and Fake Dispersion Strategy
Yuankun Xie
Ruibo Fu
Zhengqi Wen
Zhiyong Wang
Xiaopeng Wang
Haonnan Cheng
Long Ye
Jianhua Tao
333
12
0
05 Jun 2024
Towards Supervised Performance on Speaker Verification with Self-Supervised Learning by Leveraging Large-Scale ASR Models
Victor Miara
Theo Lepage
Reda Dehak
221
7
0
04 Jun 2024
Mamba in Speech: Towards an Alternative to Self-Attention
Xiangyu Zhang
Qiquan Zhang
Hexin Liu
Tianyi Xiao
Xinyuan Qian
Beena Ahmed
E. Ambikairajah
Haizhou Li
Julien Epps
Mamba
378
91
0
21 May 2024
Neighborhood Attention Transformer with Progressive Channel Fusion for Speaker Verification
Nian Li
Jianguo Wei
ViT
246
0
0
20 May 2024
Robust Singing Voice Transcription Serves Synthesis
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Ruiqi Li
Yu Zhang
Yongqi Wang
Zhiqing Hong
Rongjie Huang
Zhou Zhao
308
16
0
16 May 2024
Speaker Embeddings With Weakly Supervised Voice Activity Detection For Efficient Speaker Diarization
The Speaker and Language Recognition Workshop (Odyssey), 2024
Jenthe Thienpondt
Kris Demuynck
180
3
0
15 May 2024
Audio-Visual Target Speaker Extraction with Reverse Selective Auditory Attention
Ruijie Tao
Xinyuan Qian
Yidi Jiang
Junjie Li
Jiadong Wang
Haizhou Li
319
2
0
29 Apr 2024
MER 2024: Semi-Supervised Learning, Noise Robustness, and Open-Vocabulary Multimodal Emotion Recognition
Zheng Lian
Haiyang Sun
Guoying Zhao
Zhuofan Wen
Siyuan Zhang
...
Yinan Han
Xiaoshi Zhong
Guoying Zhao
Björn W. Schuller
Jianhua Tao
VLM
366
33
0
26 Apr 2024
TRNet: Two-level Refinement Network leveraging Speech Enhancement for Noise Robust Speech Emotion Recognition
Chengxin Chen
Pengyuan Zhang
205
4
0
19 Apr 2024
A Large-Scale Evaluation of Speech Foundation Models
Shu-Wen Yang
Heng-Jui Chang
Zili Huang
Andy T. Liu
Cheng-I Jeff Lai
...
Kushal Lakhotia
Shang-Wen Li
Abdelrahman Mohamed
Shinji Watanabe
Hung-yi Lee
272
55
0
15 Apr 2024
What is Learnt by the LEArnable Front-end (LEAF)? Adapting Per-Channel Energy Normalisation (PCEN) to Noisy Conditions
Interspeech (Interspeech), 2023
Hanyu Meng
V. Sethu
E. Ambikairajah
230
3
0
10 Apr 2024
The VoicePrivacy 2024 Challenge Evaluation Plan
N. Tomashenko
Xiaoxiao Miao
Pierre Champion
Sarina Meyer
Xin Wang
Emmanuel Vincent
Michele Panariello
Nicholas W. D. Evans
Junichi Yamagishi
Massimiliano Todisco
280
58
0
03 Apr 2024
Maximum Discrepancy Generative Regularization and Non-Negative Matrix Factorization for Single Channel Source Separation
Martin Ludvigsen
M. Grasmair
142
0
0
26 Mar 2024
XLAVS-R: Cross-Lingual Audio-Visual Speech Representation Learning for Noise-Robust Speech Perception
HyoJung Han
Mohamed Anwar
J. Pino
Wei-Ning Hsu
Marine Carpuat
Bowen Shi
Changhan Wang
VLM
246
15
0
21 Mar 2024
Speech-Aware Neural Diarization with Encoder-Decoder Attractor Guided by Attention Constraints
PeiYing Lee
HauYun Guo
Berlin Chen
183
0
0
21 Mar 2024
An Efficient End-to-End Approach to Noise Invariant Speech Features via Multi-Task Learning
Heitor R. Guimarães
Arthur Pimentel
Anderson R. Avila
Mehdi Rezagholizadeh
Boxing Chen
Tiago H. Falk
256
1
0
13 Mar 2024
Speech Robust Bench: A Robustness Benchmark For Speech Recognition
International Conference on Learning Representations (ICLR), 2024
Muhammad A. Shah
David Solans Noguero
Mikko A. Heikkilä
Nicolas Kourtellis
228
12
0
08 Mar 2024
Exploration of Adapter for Noise Robust Automatic Speech Recognition
Hao Shi
Tatsuya Kawahara
258
6
0
28 Feb 2024
ChildAugment: Data Augmentation Methods for Zero-Resource Children's Speaker Verification
Vishwanath Pratap Singh
Md. Sahidullah
Tomi Kinnunen
123
10
0
23 Feb 2024
It's Never Too Late: Fusing Acoustic Information into Large Language Models for Automatic Speech Recognition
Chen Chen
Ruizhe Li
Yuchen Hu
Sabato Marco Siniscalchi
Pin-Yu Chen
Ensiong Chng
Chao-Han Huck Yang
223
32
0
08 Feb 2024
Adversarial Data Augmentation for Robust Speaker Verification
Zhenyu Zhou
Junhui Chen
Namin Wang
Lantian Li
Dong Wang
212
6
0
05 Feb 2024
Music Auto-Tagging with Robust Music Representation Learned via Domain Adversarial Training
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Haesun Joung
Kyogu Lee
174
1
0
27 Jan 2024
Adversarial speech for voice privacy protection from Personalized Speech generation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Shihao Chen
Liping Chen
Jie Zhang
KongAik Lee
Zhenhua Ling
Lirong Dai
AAML
209
10
0
22 Jan 2024
An Empirical Study on the Impact of Positional Encoding in Transformer-based Monaural Speech Enhancement
Qiquan Zhang
Meng Ge
Hongxu Zhu
E. Ambikairajah
Qi Song
Zhaoheng Ni
Haizhou Li
233
15
0
18 Jan 2024
MLAAD: The Multi-Language Audio Anti-Spoofing Dataset
IEEE International Joint Conference on Neural Network (IJCNN), 2024
Nicolas Müller
Piotr Kawa
Wei Herng Choong
Edresson Casanova
Eren Golge
Thorsten Muller
P. Syga
Philip Sperl
Konstantin Böttinger
376
97
0
17 Jan 2024
Noise-robust zero-shot text-to-speech synthesis conditioned on self-supervised speech-representation model with adapters
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Kenichi Fujita
Hiroshi Sato
Takanori Ashihara
Hiroki Kanagawa
Marc Delcroix
Takafumi Moriya
Yusuke Ijima
125
14
0
10 Jan 2024
MERBench: A Unified Evaluation Benchmark for Multimodal Emotion Recognition
Zheng Lian
Guoying Zhao
Yong Ren
Hao Gu
Haiyang Sun
Lan Chen
Yinan Han
Jianhua Tao
398
26
0
07 Jan 2024
MLCA-AVSR: Multi-Layer Cross Attention Fusion based Audio-Visual Speech Recognition
He Wang
Pengcheng Guo
Pan Zhou
Lei Xie
379
17
0
07 Jan 2024
Gradient weighting for speaker verification in extremely low Signal-to-Noise Ratio
Yi Ma
Kong Aik Lee
Ville Hautamaki
Meng Ge
Haizhou Li
134
1
0
05 Jan 2024
Self-supervised Reflective Learning through Self-distillation and Online Clustering for Speaker Representation Learning
IEEE Transactions on Audio, Speech, and Language Processing (IEEE TASLP), 2024
Danwei Cai
Zexin Cai
Ze Li
Ming Li
284
2
0
03 Jan 2024
Self-Supervised Adaptive AV Fusion Module for Pre-Trained ASR Models
Christopher Simic
Tobias Bocklet
221
10
0
21 Dec 2023
Noise robust distillation of self-supervised speech models via correlation metrics
Fabian Ritter-Gutierrez
Kuan-Po Huang
Dianwen Ng
Jeremy H.M Wong
Hung-yi Lee
Chng Eng Siong
Nancy F. Chen
284
4
0
19 Dec 2023
NeXt-TDNN: Modernizing Multi-Scale Temporal Convolution Backbone for Speaker Verification
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Hyunjun Heo
U.H Shin
Ran Lee
YoungJu Cheon
Hyung-Min Park
215
26
0
14 Dec 2023
Robust End-to-End Diarization with Domain Adaptive Training and Multi-Task Learning
Automatic Speech Recognition & Understanding (ASRU), 2023
Ivan Fung
Lahiru Samarakoon
Samuel J. Broughton
OOD
252
2
0
12 Dec 2023
Testing Correctness, Fairness, and Robustness of Speech Emotion Recognition Models
Anna Derington
H. Wierstorf
Ali Özkil
F. Eyben
Felix Burkhardt
Björn W. Schuller
354
2
0
11 Dec 2023
DiaPer: End-to-End Neural Diarization with Perceiver-Based Attractors
Federico Landini
Mireia Díez
Themos Stafylakis
Lukáš Burget
316
20
0
07 Dec 2023
AV2AV: Direct Audio-Visual Speech to Audio-Visual Speech Translation with Unified Audio-Visual Speech Representation
Computer Vision and Pattern Recognition (CVPR), 2023
J. Choi
Se Jin Park
Minsu Kim
Y. Ro
359
16
0
05 Dec 2023
Phonetic-aware speaker embedding for far-field speaker verification
Zezhong Jin
Youzhi Tu
Man-Wai Mak
196
2
0
27 Nov 2023
Summary of the DISPLACE Challenge 2023 - DIarization of SPeaker and LAnguage in Conversational Environments
Shikha Baghel
Shreyas Ramoji
Somil Jain
Pratik Roy Chowdhuri
Prachi Singh
Deepu Vijayasenan
Sriram Ganapathy
174
10
0
21 Nov 2023
DINO-VITS: Data-Efficient Zero-Shot TTS with Self-Supervised Speaker Verification Loss for Noise Robustness
Vikentii Pankov
Valeria Pronina
Alexander Kuzmin
Maksim Borisov
Nikita Usoltsev
Xingshan Zeng
Alexander Golubkov
Nikolai Ermolenko
Aleksandra Shirshova
Yulia Matveeva
159
6
0
16 Nov 2023
R-Spin: Efficient Speaker and Noise-invariant Representation Learning with Acoustic Pieces
North American Chapter of the Association for Computational Linguistics (NAACL), 2023
Heng-Jui Chang
James R. Glass
230
7
0
15 Nov 2023
TorchAudio 2.1: Advancing speech recognition, self-supervised learning, and audio processing components for PyTorch
Automatic Speech Recognition & Understanding (ASRU), 2023
Jeff Hwang
Moto Hira
Caroline Chen
Xiaohui Zhang
Zhaoheng Ni
...
Yumeng Tao
Robin Scheibler
Samuele Cornell
Sean Kim
Stavros Petridis
244
35
0
27 Oct 2023