1510.08484
Cited By
MUSAN: A Music, Speech, and Noise Corpus
28 October 2015
David Snyder
Guoguo Chen
Daniel Povey
Papers citing
"MUSAN: A Music, Speech, and Noise Corpus"
50 / 664 papers shown
OpenSR: Open-Modality Speech Recognition via Maintaining Multi-Modality Alignment
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Xize Cheng
Tao Jin
Lin Li
Wang Lin
Xinyu Duan
Zhou Zhao
VLM
266
20
0
10 Jun 2023
Meta-Learning Framework for End-to-End Imposter Identification in Unseen Speaker Recognition
Automatic Speech Recognition & Understanding (ASRU), 2023
Ashutosh Chaubey
Sparsh Sinha
Susmita Ghose
235
1
0
01 Jun 2023
A Teacher-Student approach for extracting informative speaker embeddings from speech mixtures
Interspeech (Interspeech), 2023
Tobias Cord-Landwehr
Christoph Boeddeker
Catalin Zorila
R. Doddipatla
Reinhold Haeb-Umbach
317
5
0
01 Jun 2023
Make-A-Voice: Unified Voice Synthesis With Discrete Representation
Rongjie Huang
Chunlei Zhang
Yongqiang Wang
Dongchao Yang
Lu Liu
Zhenhui Ye
Ziyue Jiang
Chao Weng
Zhou Zhao
Dong Yu
DiffM
178
34
0
30 May 2023
Improving Textless Spoken Language Understanding with Discrete Units as Intermediate Target
Interspeech (Interspeech), 2023
Guanyong Wu
Guan-Ting Lin
Shang-Wen Li
Hung-yi Lee
220
6
0
29 May 2023
One-Step Knowledge Distillation and Fine-Tuning in Using Large Pre-Trained Self-Supervised Learning Models for Speaker Verification
Interspeech (Interspeech), 2023
Ju-Sung Heo
Chan-yeong Lim
Ju-ho Kim
Hyun-Seo Shin
Ha-Jin Yu
249
6
0
27 May 2023
DistriBlock: Identifying adversarial audio samples by leveraging characteristics of the output distribution
Conference on Uncertainty in Artificial Intelligence (UAI), 2023
Matías P. Pizarro
D. Kolossa
Asja Fischer
AAML
496
2
0
26 May 2023
Visualizing data augmentation in deep speaker recognition
Interspeech (Interspeech), 2023
Pengqi Li
Lantian Li
A. Hamdulla
D. Wang
127
4
0
25 May 2023
AV-TranSpeech: Audio-Visual Robust Speech-to-Speech Translation
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Rongjie Huang
Huadai Liu
Xize Cheng
Yi Ren
Lin Li
...
Jinzheng He
Lichao Zhang
Jinglin Liu
Xiaoyue Yin
Zhou Zhao
209
10
0
24 May 2023
P-vectors: A Parallel-Coupled TDNN/Transformer Network for Speaker Verification
Interspeech (Interspeech), 2023
Xiyuan Wang
Fangyuan Wang
Bo Xu
Liang Xu
Jing Xiao
214
6
0
24 May 2023
Multi-Stream Extension of Variational Bayesian HMM Clustering (MS-VBx) for Combined End-to-End and Vector Clustering-based Diarization
Interspeech (Interspeech), 2023
Marc Delcroix
Naohiro Tawara
Mireia Díez
Federico Landini
Anna Silnova
A. Ogawa
Tomohiro Nakatani
L. Burget
S. Araki
160
7
0
23 May 2023
An Enhanced Res2Net with Local and Global Feature Fusion for Speaker Verification
Interspeech (Interspeech), 2023
Yafeng Chen
Siqi Zheng
Haibo Wang
Luyao Cheng
Qian Chen
Jiajun Qi
187
63
0
22 May 2023
Progressive Sub-Graph Clustering Algorithm for Semi-Supervised Domain Adaptation Speaker Verification
International Conference on Signal Processing, Communications and Computing (ICSPCC), 2023
Zhuo Li
Jingze Lu
Z. Zhao
Wenchao Wang
Pengyuan Zhang
153
1
0
22 May 2023
The HCCL system for VoxCeleb Speaker Recognition Challenge 2022
Zhenduo Zhao
Zhuo Li
Wenchao Wang
Pengyuan Zhang
111
4
0
22 May 2023
On the Efficacy and Noise-Robustness of Jointly Learned Speech Emotion and Automatic Speech Recognition
Interspeech (Interspeech), 2023
L. Bansal
S. P. Dubagunta
Malolan Chetlur
Pushpak Jagtap
A. Ganapathiraju
180
1
0
21 May 2023
Towards Robust Family-Infant Audio Analysis Based on Unsupervised Pretraining of Wav2vec 2.0 on Large-Scale Unlabeled Family Audio
Interspeech (Interspeech), 2023
Jialu Li
M. Hasegawa-Johnson
Nancy L. McElwain
258
14
0
21 May 2023
Blank-regularized CTC for Frame Skipping in Neural Transducer
Interspeech (Interspeech), 2023
Yifan Yang
Xiaoyu Yang
Liyong Guo
Zengwei Yao
Wei Kang
Fangjun Kuang
Long Lin
Xie Chen
Daniel Povey
136
11
0
19 May 2023
Cross-Modal Global Interaction and Local Alignment for Audio-Visual Speech Recognition
International Joint Conference on Artificial Intelligence (IJCAI), 2023
Yuchen Hu
Ruizhe Li
Chen Chen
Heqing Zou
Qiu-shi Zhu
Eng Siong Chng
212
14
0
16 May 2023
Ripple sparse self-attention for monaural speech enhancement
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Qiquan Zhang
Hongxu Zhu
Qi Song
Xinyuan Qian
Zhaoheng Ni
Haizhou Li
100
9
0
15 May 2023
Deep Audio-Visual Singing Voice Transcription based on Self-Supervised Learning Models
Xiangming Gu
Weizhen Zeng
Jianan Zhang
Longshen Ou
Ye Wang
246
6
0
24 Apr 2023
Multi-channel Speech Separation Using Spatially Selective Deep Non-linear Filters
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Kristina Tesch
Timo Gerkmann
168
34
0
24 Apr 2023
MER 2023: Multi-label Learning, Modality Robustness, and Semi-Supervised Learning
ACM Multimedia (ACM MM), 2023
Zheng Lian
Haiyang Sun
Guoying Zhao
Kang Chen
Mingyu Xu
...
Meng Wang
Xiaoshi Zhong
Guoying Zhao
Björn W. Schuller
Jianhua Tao
269
81
0
18 Apr 2023
Fast Random Approximation of Multi-channel Room Impulse Response
Yi Luo
Rongzhi Gu
204
8
0
17 Apr 2023
Efficient Sequence Transduction by Jointly Predicting Tokens and Durations
International Conference on Machine Learning (ICML), 2023
Hainan Xu
Fei Jia
Somshubra Majumdar
Hengguan Huang
Shinji Watanabe
Boris Ginsburg
180
44
0
13 Apr 2023
Self-Supervised Learning with Cluster-Aware-DINO for High-Performance Robust Speaker Verification
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Bing Han
Zhengyang Chen
Y. Qian
143
36
0
12 Apr 2023
Margin-Mixup: A Method for Robust Speaker Verification in Multi-Speaker Audio
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Jenthe Thienpondt
N. Madhu
Kris Demuynck
124
7
0
07 Apr 2023
To Wake-up or Not to Wake-up: Reducing Keyword False Alarm by Successive Refinement
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Yashas Malur Saidutta
R. S. Srinivasa
Ching Hua Lee
Chouchang Yang
Yilin Shen
Hongxia Jin
154
2
0
06 Apr 2023
Cluster-Guided Unsupervised Domain Adaptation for Deep Speaker Embedding
IEEE Signal Processing Letters (IEEE SPL), 2023
Haiquan Mao
Fenglu Hong
Man-Wai Mak
183
10
0
28 Mar 2023
Exploring Turkish Speech Recognition via Hybrid CTC/Attention Architecture and Multi-feature Fusion Network
Zeyu Ren
Nurmemet Yolwas
Huiru Wang
Wushour Slamu
75
0
0
22 Mar 2023
DS-TDNN: Dual-stream Time-delay Neural Network with Global-aware Filter for Speaker Verification
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Yangfu Li
Jiapan Gan
Xiaodan Lin
275
9
0
20 Mar 2023
ERSAM: Neural Architecture Search For Energy-Efficient and Real-Time Social Ambiance Measurement
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Chaojian Li
Wenwan Chen
Jiayi Yuan
Yingyan Lin
Ashutosh Sabharwal
245
0
0
19 Mar 2023
Enhancing Unsupervised Audio Representation Learning via Adversarial Sample Generation
Yulin Pan
Xiangteng He
Biao Gong
Yuxin Peng
Yiliang Lv
SSL
117
0
0
15 Mar 2023
Neural Diarization with Non-autoregressive Intermediate Attractors
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Yusuke Fujita
Tatsuya Komatsu
Robin Scheibler
Yusuke Kida
Tetsuji Ogawa
223
14
0
13 Mar 2023
MixSpeech: Cross-Modality Self-Learning with Audio-Visual Stream Mixup for Visual Speech Translation and Recognition
IEEE International Conference on Computer Vision (ICCV), 2023
Xize Cheng
Lin Li
Tao Jin
Rongjie Huang
Wang Lin
Zehan Wang
Huangdai Liu
Yejin Wang
Aoxiong Yin
Zhou Zhao
210
29
0
09 Mar 2023
TOLD: A Novel Two-Stage Overlap-Aware Framework for Speaker Diarization
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Jiaming Wang
Zhihao Du
Shiliang Zhang
124
8
0
08 Mar 2023
Improving Transformer-based End-to-End Speaker Diarization by Assigning Auxiliary Losses to Attention Heads
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Ye-Rin Jeoung
Joon-Young Yang
Jeong-Hwan Choi
Joon‐Hyuk Chang
70
15
0
02 Mar 2023
Distilling Multi-Level X-vector Knowledge for Small-footprint Speaker Verification
Xuechen Liu
Md. Sahidullah
Tomi Kinnunen
274
6
0
02 Mar 2023
MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation
Interspeech (Interspeech), 2023
Mohamed Anwar
Bowen Shi
Vedanuj Goswami
Wei-Ning Hsu
J. Pino
Changhan Wang
227
44
0
01 Mar 2023
CAM++: A Fast and Efficient Network for Speaker Verification Using Context-Aware Masking
Interspeech (Interspeech), 2023
Haibo Wang
Siqi Zheng
Yafeng Chen
Luyao Cheng
Qian Chen
160
157
0
01 Mar 2023
Distance-based Weight Transfer from Near-field to Far-field Speaker Verification
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Li Zhang
Qing Wang
Hongji Wang
Yue Li
Wei Rao
Yannan Wang
Linfu Xie
203
5
0
01 Mar 2023
PCF: ECAPA-TDNN with Progressive Channel Fusion for Speaker Verification
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Z. Zhao
Zhuo Li
Wenchao Wang
Pengyuan Zhang
118
33
0
01 Mar 2023
Practice of the Conformer Enhanced Audio-Visual HuBERT on Mandarin and English
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Xiaoming Ren
Chao Li
Shenjian Wang
Biao Li
130
0
0
28 Feb 2023
Ensemble knowledge distillation of self-supervised speech models
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Kuan-Po Huang
Tzu-hsun Feng
Yu-Kuan Fu
Tsung-Yuan Hsu
Po-Chieh Yen
Wei-Cheng Tseng
Kai-Wei Chang
Hung-yi Lee
278
21
0
24 Feb 2023
Cross-modal Audio-visual Co-learning for Text-independent Speaker Verification
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Meng Liu
Kong Aik Lee
Longbiao Wang
Hanyi Zhang
Chang Zeng
Jianwu Dang
171
13
0
22 Feb 2023
Advancing Stuttering Detection via Data Augmentation, Class-Balanced Loss and Multi-Contextual Deep Learning
IEEE journal of biomedical and health informatics (IEEE JBHI), 2023
S. A. Sheikh
Md. Sahidullah
F. Hirsch
Slim Ouni
194
23
0
21 Feb 2023
VoxSRC 2022: The Fourth VoxCeleb Speaker Recognition Challenge
Jaesung Huh
A. Brown
Jee-weon Jung
Joon Son Chung
Arsha Nagrani
D. Garcia-Romero
Andrew Zisserman
233
30
0
20 Feb 2023
RobustDistiller: Compressing Universal Speech Representations for Enhanced Environment Robustness
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Heitor R. Guimarães
Arthur Pimentel
Anderson R. Avila
Mehdi Rezagholizadeh
Boxing Chen
Tiago H. Falk
362
12
0
18 Feb 2023
Improving Transformer-based Networks With Locality For Automatic Speaker Verification
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Mufan Sang
Yong Zhao
Gang Liu
John H. L. Hansen
Jian Wu
ViT
216
15
0
17 Feb 2023
Cross-Corpora Spoken Language Identification with Domain Diversification and Generalization
Computer Speech and Language (CSL), 2023
Spandan Dey
Md. Sahidullah
G. Saha
142
13
0
10 Feb 2023
MooseNet: A Trainable Metric for Synthesized Speech with a PLDA Module
Speech Synthesis Workshop (SSW), 2023
Ondřej Plátek
Ondrej Dusek
188
2
0
17 Jan 2023