1510.08484
Cited By
MUSAN: A Music, Speech, and Noise Corpus
28 October 2015
David Snyder
Guoguo Chen
Daniel Povey
Papers citing
"MUSAN: A Music, Speech, and Noise Corpus"
50 / 664 papers shown
Fast Entropy-Based Methods of Word-Level Confidence Estimation for End-To-End Automatic Speech Recognition
Spoken Language Technology Workshop (SLT), 2022
A. Laptev
Boris Ginsburg
206
12
0
16 Dec 2022
Leveraging Modality-specific Representations for Audio-visual Speech Recognition via Reinforcement Learning
AAAI Conference on Artificial Intelligence (AAAI), 2022
Chen Chen
Yuchen Hu
Qiang Zhang
Heqing Zou
Beier Zhu
Eng Siong Chng
263
34
0
10 Dec 2022
GPU-accelerated Guided Source Separation for Meeting Transcription
Interspeech (Interspeech), 2022
Desh Raj
Daniel Povey
Sanjeev Khudanpur
321
47
0
10 Dec 2022
Covariance Regularization for Probabilistic Linear Discriminant Analysis
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Zhiyuan Peng
Mingjie Shao
Xuanji He
Xu Li
Tan Lee
Ke Ding
Guanglu Wan
123
2
0
06 Dec 2022
Self-Supervised Audio-Visual Speech Representations Learning By Multimodal Self-Distillation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Jing-Xuan Zhang
Genshun Wan
Zhenhua Ling
Jia Pan
Jianqing Gao
Cong Liu
SSL
225
15
0
06 Dec 2022
A General Unfolding Speech Enhancement Method Motivated by Taylor's Theorem
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Andong Li
Guochen Yu
C. Zheng
Wenzhe Liu
Xiaodong Li
275
21
0
30 Nov 2022
MSV Challenge 2022: NPU-HC Speaker Verification System for Low-resource Indian Languages
Yue Li
Li Zhang
Na Wang
Jie Liu
Linfu Xie
151
0
0
30 Nov 2022
TaylorBeamixer: Learning Taylor-Inspired All-Neural Multi-Channel Speech Enhancement from Beam-Space Dictionary Perspective
Interspeech (Interspeech), 2022
Andong Li
Guochen Yu
Wenzhe Liu
Xiaodong Li
C. Zheng
224
2
0
22 Nov 2022
VATLM: Visual-Audio-Text Pre-Training with Unified Masked Prediction for Speech Representation Learning
IEEE Transactions on Multimedia (IEEE TMM), 2022
Qiu-shi Zhu
Long Zhou
Zi-Hua Zhang
Shujie Liu
Binxing Jiao
Jie Zhang
Lirong Dai
Daxin Jiang
Jinyu Li
Furu Wei
265
50
0
21 Nov 2022
Simultaneously Learning Robust Audio Embeddings and balanced Hash codes for Query-by-Example
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Anup Singh
Kris Demuynck
Vipul Arora
100
8
0
20 Nov 2022
Speaker Overlap-aware Neural Diarization for Multi-party Meeting Analysis
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Zhihao Du
Shiliang Zhang
Siqi Zheng
Zhijie Yan
101
20
0
18 Nov 2022
Multi-source Domain Adaptation for Text-independent Forensic Speaker Recognition
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Zhenyu Wang
John H. L. Hansen
181
25
0
17 Nov 2022
Speaker Adaptation for End-To-End Speech Recognition Systems in Noisy Environments
Automatic Speech Recognition & Understanding (ASRU), 2022
Dominik Wagner
Ilja Baumann
Sebastian P. Bayerl
Korbinian Riedhammer
Tobias Bocklet
248
3
0
16 Nov 2022
The Potential of Neural Speech Synthesis-based Data Augmentation for Personalized Speech Enhancement
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Anastasia Kuznetsova
Aswin Sivaraman
Minje Kim
209
6
0
14 Nov 2022
Towards A Unified Conformer Structure: from ASR to ASV Task
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Dexin Liao
Tao Jiang
Feng Wang
Lin Li
Q. Hong
189
14
0
14 Nov 2022
Multi-Speaker and Wide-Band Simulated Conversations as Training Data for End-to-End Neural Diarization
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Federico Landini
Mireia Díez
Alicia Lozano-Diez
L. Burget
200
22
0
12 Nov 2022
Improving the Robustness of DistilHuBERT to Unseen Noisy Conditions via Data Augmentation, Curriculum Learning, and Multi-Task Enhancement
Heitor R. Guimarães
Arthur Pimentel
Anderson R. Avila
Mehdi Rezagholizadeh
Tiago H. Falk
286
3
0
12 Nov 2022
Low Pass Filtering and Bandwidth Extension for Robust Anti-spoofing Countermeasure Against Codec Variabilities
International Symposium on Chinese Spoken Language Processing (ISCSLP), 2022
Yikang Wang
Xingming Wang
Hiromitsu Nishizaki
Ming Li
140
5
0
12 Nov 2022
Handling Trade-Offs in Speech Separation with Sparsely-Gated Mixture of Experts
Xiaofei Wang
Zhuo Chen
Yu Shi
Jian Wu
Naoyuki Kanda
Takuya Yoshioka
MoE
175
2
0
11 Nov 2022
High-resolution embedding extractor for speaker diarisation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Hee-Soo Heo
Youngki Kwon
Bong-Jin Lee
You Jin Kim
Jee-weon Jung
210
5
0
08 Nov 2022
Late Audio-Visual Fusion for In-The-Wild Speaker Diarization
Zexu Pan
Gordon Wichern
François Germain
Aswin Shanmugam Subramanian
Jonathan Le Roux
VGen
317
2
0
02 Nov 2022
data2vec-aqc: Search for the right Teaching Assistant in the Teacher-Student training setup
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Vasista Sai Lodagala
Sreyan Ghosh
S. Umesh
SSL
158
5
0
02 Nov 2022
I4U System Description for NIST SRE'20 CTS Challenge
Kong Aik Lee
Tomi Kinnunen
Daniele Colibro
C. Vair
A. Nautsch
...
Ruijie Tao
Haizhou Li
Alfonso Ortega Giménez
Longbiao Wang
L. Buera
80
1
0
02 Nov 2022
LMD: A Learnable Mask Network to Detect Adversarial Examples for Speaker Verification
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Xingqi Chen
Jie Wang
Xiaoli Zhang
Weiqiang Zhang
Kunde Yang
AAML
278
10
0
02 Nov 2022
Build a SRE Challenge System: Lessons from VoxSRC 2022 and CNSRC 2022
Interspeech (Interspeech), 2022
Zhengyang Chen
Bing Han
Xu Xiang
Houjun Huang
Bei Liu
Y. Qian
219
17
0
02 Nov 2022
Metric Learning for User-defined Keyword Spotting
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Jaemin Jung
You-kyong Kim
Jihwan Park
Youshin Lim
Byeong-Yeol Kim
Youngjoon Jang
Joon Son Chung
232
16
0
01 Nov 2022
Waveform Boundary Detection for Partially Spoofed Audio
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Zexin Cai
Weiqing Wang
Ming Li
89
35
0
01 Nov 2022
Model Compression for DNN-based Speaker Verification Using Weight Quantization
Interspeech (Interspeech), 2022
Jingyu Li
W. Liu
Zhaoyang Zhang
Jiong Wang
Tan Lee
MQ
385
3
0
31 Oct 2022
Convolution-Based Channel-Frequency Attention for Text-Independent Speaker Verification
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Jingyu Li
Yusheng Tian
Tan Lee
113
14
0
31 Oct 2022
Fast and parallel decoding for transducer
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Wei Kang
Liyong Guo
Fangjun Kuang
Long Lin
Mingshuang Luo
Zengwei Yao
Xiaoyu Yang
Piotr Żelasko
Daniel Povey
AI4TS
257
19
0
31 Oct 2022
Delay-penalized transducer for low-latency streaming ASR
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Wei Kang
Zengwei Yao
Fangjun Kuang
Liyong Guo
Xiaoyu Yang
Long Lin
Piotr Żelasko
Daniel Povey
253
11
0
31 Oct 2022
Predicting Multi-Codebook Vector Quantization Indexes for Knowledge Distillation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Liyong Guo
Xiaoyu Yang
Quandong Wang
Yuxiang Kong
Zengwei Yao
...
Wei Kang
Long Lin
Mingshuang Luo
Piotr Żelasko
Daniel Povey
VLM
188
10
0
31 Oct 2022
Wespeaker: A Research and Production oriented Speaker Embedding Learning Toolkit
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Hongji Wang
Che-Yuan Liang
Shuai Wang
Zhengyang Chen
Binbin Zhang
Xu Xiang
Yan Deng
Y. Qian
282
194
0
31 Oct 2022
SRTNet: Time Domain Speech Enhancement Via Stochastic Refinement
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Zhibin Qiu
Mengfan Fu
Yinfeng Yu
Lili Yin
Gang Hua
Hao-Ming Huang
DiffM
264
21
0
30 Oct 2022
Adaptive Speech Quality Aware Complex Neural Network for Acoustic Echo Cancellation with Supervised Contrastive Learning
Bozhong Liu
Xiaoxi Yu
Hantao Huang
254
0
0
30 Oct 2022
Speaker Representation Learning via Contrastive Loss with Maximal Speaker Separability
Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2022
Zhe Li
Man-Wai Mak
SSL
263
9
0
29 Oct 2022
Target-Speaker Voice Activity Detection via Sequence-to-Sequence Prediction
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Ming Cheng
Weiqing Wang
Yucong Zhang
Xiaoyi Qin
Ming Li
VLM
336
45
0
28 Oct 2022
A comprehensive study on self-supervised distillation for speaker representation learning
Spoken Language Technology Workshop (SLT), 2022
Zhengyang Chen
Yao Qian
Bing Han
Y. Qian
Michael Zeng
SSL
345
23
0
28 Oct 2022
Speaker recognition with two-step multi-modal deep cleansing
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Ruijie Tao
Kong Aik Lee
Zhan Shi
Haizhou Li
NoLa
139
23
0
28 Oct 2022
Self-Supervised Training of Speaker Encoder with Multi-Modal Diverse Positive Pairs
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Ruijie Tao
Kong Aik Lee
Rohan Kumar Das
Ville Hautamaki
Haizhou Li
SSL
216
15
0
27 Oct 2022
Robust Data2vec: Noise-robust Speech Representation Learning for ASR by Combining Regression and Improved Contrastive Learning
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Qiu-shi Zhu
Long Zhou
Jie Zhang
Shujie Liu
Yu-Chen Hu
Lirong Dai
VLM
SSL
179
43
0
27 Oct 2022
TSUP Speaker Diarization System for Conversational Short-phrase Speaker Diarization Challenge
International Symposium on Chinese Spoken Language Processing (ISCSLP), 2022
Bowen Pang
Huan Zhao
Gaosheng Zhang
Xiaoyue Yang
Yanguo Sun
Li Zhang
Qing Wang
Linfu Xie
BDL
155
3
0
26 Oct 2022
Speaker Diarization Based on Multi-channel Microphone Array in Small-scale Meeting
Yu Du
R. Zhou
91
1
0
26 Oct 2022
Improving Speech-to-Speech Translation Through Unlabeled Text
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Xuan-Phi Nguyen
Sravya Popuri
Changhan Wang
Yun Tang
Ilia Kulikov
Hongyu Gong
196
9
0
26 Oct 2022
Large-scale learning of generalised representations for speaker recognition
Jee-weon Jung
Hee-Soo Heo
Bong-Jin Lee
Jaesong Lee
Hye-jin Shim
Youngki Kwon
Joon Son Chung
Shinji Watanabe
CVBM
206
6
0
20 Oct 2022
How to Leverage DNN-based speech enhancement for multi-channel speaker verification?
Sandipana Dowerah
Romain Serizel
D. Jouvet
Mohammad MohammadAmini
D. Matrouf
150
2
0
17 Oct 2022
Spatial-DCCRN: DCCRN Equipped with Frame-Level Angle Feature and Hybrid Filtering for Multi-Channel Speech Enhancement
Spoken Language Technology Workshop (SLT), 2022
Shubo Lv
Yihui Fu
Yukai Jv
Linfu Xie
Weixin Zhu
Wei Rao
Yannan Wang
154
11
0
17 Oct 2022
Attention-Based Audio Embeddings for Query-by-Example
International Society for Music Information Retrieval Conference (ISMIR), 2022
Anup Singh
Kris Demuynck
Vipul Arora
107
13
0
16 Oct 2022
Improving generalizability of distilled self-supervised speech processing models under distorted settings
Spoken Language Technology Workshop (SLT), 2022
Kuan-Po Huang
Yu-Kuan Fu
Tsung-Yuan Hsu
Fabian Ritter-Gutierrez
Fan Wang
Liang-Hsuan Tseng
Yu Zhang
Hung-yi Lee
247
15
0
14 Oct 2022
Description and analysis of novelties introduced in DCASE Task 4 2022 on the baseline system
Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE), 2022
Francesca Ronchini
Samuele Cornell
Romain Serizel
Nicolas Turpault
Eduardo Fonseca
D. Ellis
161
14
0
14 Oct 2022