2012.03411
Cited By
MLS: A Large-Scale Multilingual Dataset for Speech Research
Interspeech (Interspeech), 2020
7 December 2020
Vineel Pratap
Qiantong Xu
Anuroop Sriram
Gabriel Synnaeve
R. Collobert
AuLLM
Papers citing
"MLS: A Large-Scale Multilingual Dataset for Speech Research"
50 / 390 papers shown
Reproducing Whisper-Style Training Using an Open-Source Toolkit and Publicly Available Data
Automatic Speech Recognition & Understanding (ASRU), 2023
Yifan Peng
Jinchuan Tian
Brian Yan
Dan Berrebbi
Xuankai Chang
...
Yui Sudo
Muhammad Shakeel
Jee-weon Jung
Soumi Maiti
Shinji Watanabe
VLM
349
63
0
25 Sep 2023
Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of A Multilingual ASR Model
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Jiamin Xie
Ke Li
Jinxi Guo
Andros Tjandra
Yuan Shangguan
Leda Sari
Chunyang Wu
Junteng Jia
Jay Mahadeokar
Ozlem Kalinli
393
3
0
22 Sep 2023
Multi-Channel MOSRA: Mean Opinion Score and Room Acoustics Estimation Using Simulated Data and a Teacher Model
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Jozef Coldenhoff
Andrew Harper
Paul Kendrick
Tijana Stojkovic
Milos Cernak
195
4
0
21 Sep 2023
Discrete Audio Representation as an Alternative to Mel-Spectrograms for Speaker and Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Krishna C. Puvvada
Nithin Rao Koluguri
Kunal Dhawan
Jagadeesh Balam
Boris Ginsburg
143
26
0
19 Sep 2023
Investigating End-to-End ASR Architectures for Long Form Audio Transcription
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Nithin Rao Koluguri
Samuel Kriman
Georgy Zelenfroind
Somshubra Majumdar
Dima Rekesh
Vahid Noroozi
Jagadeesh Balam
Boris Ginsburg
AuLLM
241
15
0
18 Sep 2023
Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Wei Kang
Xiaoyu Yang
Zengwei Yao
Fangjun Kuang
Yifan Yang
Liyong Guo
Long Lin
Daniel Povey
252
112
0
15 Sep 2023
Voxtlm: unified decoder-only models for consolidating speech recognition/synthesis and speech/text continuation tasks
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Soumi Maiti
Yifan Peng
Shukjae Choi
Jee-weon Jung
Xuankai Chang
Shinji Watanabe
VLM
AuLLM
365
86
0
14 Sep 2023
LeBenchmark 2.0: a Standardized, Replicable and Enhanced Framework for Self-supervised Representations of French Speech
Computer Speech and Language (CSL), 2023
Titouan Parcollet
H. Nguyen
Solène Evain
Marcely Zanon Boito
Adrien Pupier
...
François Portet
Solange Rossato
Fabien Ringeval
D. Schwab
Laurent Besacier
262
27
0
11 Sep 2023
PromptTTS 2: Describing and Generating Voices with Text Prompt
Yichong Leng
Zhifang Guo
Kai Shen
Xu Tan
Zeqian Ju
...
Lei He
Xiang-Yang Li
Sheng Zhao
Tao Qin
Jiang Bian
VLM
DiffM
274
71
0
05 Sep 2023
RepCodec: A Speech Representation Codec for Speech Tokenization
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Zhichao Huang
Chutong Meng
Tom Ko
217
41
0
31 Aug 2023
SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models
International Conference on Learning Representations (ICLR), 2023
Xin Zhang
Dong Zhang
Shimin Li
Yaqian Zhou
Xipeng Qiu
371
112
0
31 Aug 2023
Improving Small Footprint Few-shot Keyword Spotting with Supervision on Auxiliary Data
Interspeech (Interspeech), 2023
Seunghan Yang
Byeonggeun Kim
Kyuhong Shim
Simyoung Chang
223
2
0
31 Aug 2023
Sparks of Large Audio Models: A Survey and Outlook
S. Latif
Moazzam Shoukat
Fahad Shamshad
Muhammad Usama
Yi Ren
...
Wenwu Wang
Xulong Zhang
Roberto Togneri
Xiaoshi Zhong
Björn W. Schuller
LM&MA
AuLLM
683
53
0
24 Aug 2023
Lip Reading for Low-resource Languages by Learning and Combining General Speech Knowledge and Language-specific Knowledge
IEEE International Conference on Computer Vision (ICCV), 2023
Minsu Kim
Jeong Hun Yeo
J. Choi
Y. Ro
214
28
0
18 Aug 2023
EXPRESSO: A Benchmark and Analysis of Discrete Expressive Speech Resynthesis
Interspeech (Interspeech), 2023
Tu Nguyen
Wei-Ning Hsu
Antony D'Avirro
Bowen Shi
Itai Gat
...
Gabriel Synnaeve
Michael Hassid
Felix Kreuk
Yossi Adi
Emmanuel Dupoux
232
109
0
10 Aug 2023
Federated Representation Learning for Automatic Speech Recognition
Guruprasad V Ramesh
Gopinath Chennupati
Milind Rao
Anit Kumar Sahu
Ariya Rastrow
J. Droppo
210
0
0
03 Aug 2023
An objective evaluation of Hearing Aids and DNN-based speech enhancement in complex acoustic scenes
IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2023
Enric Gusó
Joanna Luberadzka
Martí Baig
Umut Sayin Saraç
Xavier Serra
133
5
0
24 Jul 2023
Prompting Large Language Models with Speech Recognition Abilities
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Yassir Fathullah
Chunyang Wu
Egor Lakomkin
Junteng Jia
Yuan Shangguan
...
Wenhan Xiong
Jay Mahadeokar
Ozlem Kalinli
Christian Fuegen
M. Seltzer
AuLLM
236
192
0
21 Jul 2023
MASR: Multi-label Aware Speech Representation
Automatic Speech Recognition & Understanding (ASRU), 2023
Anjali Raj
Shikhar Bharadwaj
Sriram Ganapathy
Min Ma
Shikhar Vashishth
SSL
180
0
0
20 Jul 2023
ivrit.ai: A Comprehensive Dataset of Hebrew Speech for AI Research and Development
Yanir Marmor
Kinneret Misgav
Y. Lifshitz
VLM
276
5
0
17 Jul 2023
Towards cross-language prosody transfer for dialog
Interspeech (Interspeech), 2023
Jonathan Avila
Nigel G. Ward
233
7
0
09 Jul 2023
Enrollment-stage Backdoor Attacks on Speaker Recognition Systems via Adversarial Ultrasound
IEEE Internet of Things Journal (IEEE IoT J.), 2023
Xinfeng Li
Junning Ze
Chen Yan
Yushi Cheng
Xiaoyu Ji
Wei Dong
AAML
198
14
0
28 Jun 2023
Confidence-based Ensembles of End-to-End Speech Recognition Models
Interspeech (Interspeech), 2023
Igor Gitman
Vitaly Lavrukhin
A. Laptev
Boris Ginsburg
UQCV
329
9
0
27 Jun 2023
AudioPaLM: A Large Language Model That Can Speak and Listen
Paul Kishan Rubenstein
Chulayuth Asawaroengchai
D. Nguyen
Ankur Bapna
Zalan Borsos
...
Neil Zeghidour
Yu Zhang
Zhishuai Zhang
Lukás Zilka
Christian Frank
LM&MA
AuLLM
VLM
279
399
0
22 Jun 2023
Unified model for code-switching speech recognition and language identification based on a concatenated tokenizer
Kunal Dhawan
Dima Rekesh
Boris Ginsburg
256
16
0
14 Jun 2023
Label Aware Speech Representation Learning For Language Identification
Interspeech (Interspeech), 2023
Shikhar Vashishth
Shikhar Bharadwaj
Sriram Ganapathy
Ankur Bapna
Min Ma
Wei Han
Vera Axelrod
Partha P. Talukdar
SSL
137
4
0
07 Jun 2023
Acoustic Word Embeddings for Untranscribed Target Languages with Continued Pretraining and Learned Pooling
Interspeech (Interspeech), 2023
Ramon Sanabria
Ondřej Klejch
Hao Tang
Sharon Goldwater
154
4
0
03 Jun 2023
Improved Cross-Lingual Transfer Learning For Automatic Speech Translation
Sameer Khurana
Nauman Dawalatabad
Antoine Laurent
Luis Vicente
Pablo Gimeno
Victoria Mingote
James R. Glass
VLM
371
2
0
01 Jun 2023
How to Estimate Model Transferability of Pre-Trained Speech Models?
Interspeech (Interspeech), 2023
Zih-Ching Chen
Chao-Han Huck Yang
Yue Liu
Yu Zhang
Nanxin Chen
Shoufeng Chang
Rohit Prabhavalkar
Hung-yi Lee
Tara N. Sainath
455
11
0
01 Jun 2023
Edit Distance based RL for RNNT decoding
DongSeon Hwang
Changwan Ryu
K. Sim
169
0
0
31 May 2023
BIG-C: a Multimodal Multi-Purpose Dataset for Bemba
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Claytone Sikasote
Eunice Mukonde
Md Mahfuz Ibn Alam
Antonios Anastasopoulos
179
9
0
26 May 2023
Automatic Tuning of Loss Trade-offs without Hyper-parameter Search in End-to-End Zero-Shot Speech Synthesis
Interspeech (Interspeech), 2023
Seong-Hyun Park
Bohyung Kim
Tae-Hyun Oh
201
1
0
26 May 2023
Scaling Speech Technology to 1,000+ Languages
Journal of machine learning research (JMLR), 2023
Vineel Pratap
Andros Tjandra
Bowen Shi
Paden Tomasello
Arun Babu
...
Yossi Adi
Xiaohui Zhang
Wei-Ning Hsu
Alexis Conneau
Michael Auli
VLM
401
538
0
22 May 2023
Textually Pretrained Speech Language Models
Neural Information Processing Systems (NeurIPS), 2023
Michael Hassid
Tal Remez
Tu Nguyen
Itai Gat
Alexis Conneau
...
Alexandre Défossez
Gabriel Synnaeve
Emmanuel Dupoux
Roy Schwartz
Yossi Adi
VLM
SyDa
429
94
0
22 May 2023
Comparison of Multilingual Self-Supervised and Weakly-Supervised Speech Pre-Training for Adaptation to Unseen Languages
Interspeech (Interspeech), 2023
Andrew Rouditchenko
Sameer Khurana
Samuel Thomas
Rogerio Feris
Leonid Karlinsky
Hilde Kuehne
David Harwath
Brian Kingsbury
James R. Glass
VLM
297
25
0
21 May 2023
Language-universal phonetic encoder for low-resource speech recognition
Interspeech (Interspeech), 2023
Siyuan Feng
Ming Tu
Rui Xia
Chuanzeng Huang
Yuxuan Wang
218
3
0
19 May 2023
Language-Universal Phonetic Representation in Multilingual Speech Pretraining for Low-Resource Speech Recognition
Interspeech (Interspeech), 2023
Siyuan Feng
Ming Tu
Rui Xia
Chuanzeng Huang
Yuxuan Wang
168
8
0
19 May 2023
ML-SUPERB: Multilingual Speech Universal PERformance Benchmark
Interspeech (Interspeech), 2023
Jiatong Shi
Dan Berrebbi
William Chen
Ho-Lam Chung
En-Pei Hu
...
Xuankai Chang
Shang-Wen Li
Abdel-rahman Mohamed
Hung-yi Lee
Shinji Watanabe
ELM
333
88
0
18 May 2023
Understanding and Bridging the Modality Gap for Speech Translation
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Qingkai Fang
Yang Feng
260
29
0
15 May 2023
Exploration of Language Dependency for Japanese Self-Supervised Speech Representation Models
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Takanori Ashihara
Takafumi Moriya
Kohei Matsuura
Tomohiro Tanaka
215
4
0
09 May 2023
Fast Conformer with Linearly Scalable Attention for Efficient Speech Recognition
Automatic Speech Recognition & Understanding (ASRU), 2023
Dima Rekesh
Nithin Rao Koluguri
Samuel Kriman
Somshubra Majumdar
Vahid Noroozi
...
Oleksii Hrinchuk
Krishna Puvvada
Ankur Kumar
Jagadeesh Balam
Boris Ginsburg
333
145
0
08 May 2023
NaturalSpeech 2: Latent Diffusion Models are Natural and Zero-Shot Speech and Singing Synthesizers
International Conference on Learning Representations (ICLR), 2023
Kai Shen
Zeqian Ju
Xu Tan
Yanqing Liu
Yichong Leng
Lei He
Tao Qin
Sheng Zhao
Jiang Bian
DiffM
302
333
0
18 Apr 2023
Efficient Sequence Transduction by Jointly Predicting Tokens and Durations
International Conference on Machine Learning (ICML), 2023
Hainan Xu
Fei Jia
Somshubra Majumdar
Hengguan Huang
Shinji Watanabe
Boris Ginsburg
187
45
0
13 Apr 2023
Enhancing Unsupervised Speech Recognition with Diffusion GANs
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Xianchao Wu
DiffM
190
2
0
23 Mar 2023
Configurable EBEN: Extreme Bandwidth Extension Network to enhance body-conducted speech capture
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Julien Hauret
Thomas Joubaud
V. Zimpfer
Éric Bavu
234
10
0
17 Mar 2023
Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages
Yu Zhang
Wei Han
James Qin
Yongqiang Wang
Ankur Bapna
...
Pedro J. Moreno
Chung-Cheng Chiu
J. Schalkwyk
Françoise Beaufays
Yonghui Wu
VLM
410
348
0
02 Mar 2023
Improving Massively Multilingual ASR With Auxiliary CTC Objectives
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
William Chen
Brian Yan
Jiatong Shi
Yifan Peng
Soumi Maiti
Shinji Watanabe
266
51
0
24 Feb 2023
Catch You and I Can: Revealing Source Voiceprint Against Voice Conversion
USENIX Security Symposium (USENIX Security), 2023
Jiangyi Deng
Yanjiao Chen
Yinan Zhong
Qianhao Miao
Xueluan Gong
Wenyuan Xu
250
13
0
24 Feb 2023
Speaker and Language Change Detection using Wav2vec2 and Whisper
Tijn Berns
Nik Vaessen
David A. van Leeuwen
170
6
0
18 Feb 2023
ASR Bundestag: A Large-Scale political debate dataset in German
Intelligent Systems with Applications (ISA), 2023
Johannes Wirth
René Peinl
211
2
0
12 Feb 2023