2104.14830
Scaling End-to-End Models for Large-Scale Multilingual ASR
Automatic Speech Recognition & Understanding (ASRU), 2021
30 April 2021
Yue Liu
Ruoming Pang
Tara N. Sainath
Anmol Gulati
Yu Zhang
James Qin
Parisa Haghani
Wenjie Huang
Min Ma
Junwen Bai
CLL
Papers citing
"Scaling End-to-End Models for Large-Scale Multilingual ASR"
50 / 61 papers shown
MLMA: Towards Multilingual ASR With Mamba-based Architectures
Mohamed Nabih Ali
Daniele Falavigna
Alessio Brutti
Mamba
322
0
0
21 Oct 2025
Long Chain-of-Thought Reasoning Across Languages
Josh Barua
Seun Eisape
Kayo Yin
Alane Suhr
LRM
ELM
248
6
0
20 Aug 2025
GigaAM: Efficient Self-Supervised Learner for Speech Recognition
Aleksandr Kutsakov
Alexandr Maximenko
Georgii Gospodinov
Pavel Bogomolov
Fyodor Minkin
292
2
0
01 Jun 2025
Whisper-LM: Improving ASR Models with Language Models for Low-Resource Languages
Xabier de Zuazo
Eva Navas
Ibon Saratxaga
Inma Hernáez Rioja
382
5
0
30 Mar 2025
LUPET: Incorporating Hierarchical Information Path into Multilingual ASR
Interspeech (Interspeech), 2024
Wei Liu
Jingyong Hou
Dong Yang
Muyong Cao
Tan Lee
620
2
0
10 Jan 2025
A two-stage transliteration approach to improve performance of a multilingual ASR
Rohit Kumar
218
0
0
09 Oct 2024
Exploring SSL Discrete Tokens for Multilingual ASR
Mingyu Cui
Daxin Tan
Yifan Yang
Dingdong Wang
Huimeng Wang
Xiao Chen
Xie Chen
Xunying Liu
347
6
0
13 Sep 2024
Learn and Don't Forget: Adding a New Language to ASR Foundation Models
Mengjie Qian
Siyuan Tang
Rao Ma
Kate Knill
Mark Gales
CLL
366
21
0
09 Jul 2024
GigaSpeech 2: An Evolving, Large-Scale and Multi-domain ASR Corpus for Low-Resource Languages with Automated Crawling, Transcription and Refinement
Yifan Yang
Zheshu Song
Jianheng Zhuo
Mingyu Cui
Jinpeng Li
...
Shuai Fan
Kai Yu
Wei Zhang
Guoguo Chen
Xie Chen
649
42
0
17 Jun 2024
Dual-Pipeline with Low-Rank Adaptation for New Language Integration in Multilingual ASR
Yerbolat Khassanov
Zhipeng Chen
Tianfeng Chen
Tze Yuang Chong
Wei Li
Jun Zhang
Lu Lu
Yuxuan Wang
AI4CE
267
3
0
12 Jun 2024
A Parameter-efficient Language Extension Framework for Multilingual ASR
Wei Liu
Jingyong Hou
Dong Yang
Muyong Cao
Tan Lee
CLL
333
5
0
10 Jun 2024
USM RNN-T model weights binarization
Oleg Rybakov
Dmitriy Serdyuk
Chengjian Zheng
MQ
360
2
0
05 Jun 2024
Whistle: Data-Efficient Multilingual and Crosslingual Speech Recognition via Weakly Phonetic Supervision
Saierdaer Yusuyin
Te Ma
Hao Huang
Wenbo Zhao
Zhijian Ou
424
14
0
04 Jun 2024
Exploring neural oscillations during speech perception via surrogate gradient spiking neural networks
Alexandre Bittar
Philip N. Garner
199
3
0
22 Apr 2024
Multi-modal Deep Learning
Chen Yuhua
MedIm
430
54
0
06 Mar 2024
Extreme Encoder Output Frame Rate Reduction: Improving Computational Latencies of Large End-to-End Models
Rohit Prabhavalkar
Zhong Meng
Weiran Wang
Adam Stooke
Xingyu Cai
Yanzhang He
Arun Narayanan
Dongseong Hwang
Tara N. Sainath
Pedro J. Moreno
249
11
0
27 Feb 2024
Multilingual and Fully Non-Autoregressive ASR with Large Language Model Fusion: A Comprehensive Study
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Wenjie Huang
Cyril Allauzen
Tongzhou Chen
Kilol Gupta
Ke Hu
James Qin
Yu Zhang
Yongqiang Wang
Shuo-yiin Chang
Tara N. Sainath
MoMe
307
19
0
23 Jan 2024
Efficient Adapter Finetuning for Tail Languages in Streaming Multilingual ASR
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Junwen Bai
Yue Liu
Qiujia Li
Tara N. Sainath
Trevor Strohman
381
8
0
17 Jan 2024
Findings of the 2023 ML-SUPERB Challenge: Pre-Training and Evaluation over More Languages and Beyond
Automatic Speech Recognition & Understanding (ASRU), 2023
Jiatong Shi
William Chen
Dan Berrebbi
Hsiu-Hsuan Wang
Wei-Ping Huang
...
Yuxun Tang
Shang-Wen Li
Abdelrahman Mohamed
Hung-yi Lee
Shinji Watanabe
LRM
ELM
458
18
0
09 Oct 2023
UniverSLU: Universal Spoken Language Understanding for Diverse Tasks with Natural Language Instructions
North American Chapter of the Association for Computational Linguistics (NAACL), 2023
Siddhant Arora
Hayato Futami
Jee-weon Jung
Yifan Peng
Roshan S. Sharma
Yosuke Kashiwagi
E. Tsunoo
Karen Livescu
Shinji Watanabe
ELM
295
12
0
04 Oct 2023
SSHR: Leveraging Self-supervised Hierarchical Representations for Multilingual Automatic Speech Recognition
IEEE International Conference on Multimedia and Expo (ICME), 2023
Hongfei Xue
Qijie Shao
Tommy Yuan
Peikun Chen
Jie Liu
Lei Xie
311
6
0
29 Sep 2023
Generative Speech Recognition Error Correction with Large Language Models and Task-Activating Prompting
Automatic Speech Recognition & Understanding (ASRU), 2023
Chao-Han Huck Yang
Yile Gu
Yi-Chieh Liu
Shalini Ghosh
I. Bulyko
A. Stolcke
KELM
LRM
491
91
0
27 Sep 2023
Reproducing Whisper-Style Training Using an Open-Source Toolkit and Publicly Available Data
Automatic Speech Recognition & Understanding (ASRU), 2023
Yifan Peng
Jinchuan Tian
Brian Yan
Dan Berrebbi
Xuankai Chang
...
Yui Sudo
Muhammad Shakeel
Jee-weon Jung
Soumi Maiti
Shinji Watanabe
VLM
410
72
0
25 Sep 2023
Improving vision-inspired keyword spotting using dynamic module skipping in streaming conformer encoder
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Alexandre Bittar
Paul Dixon
Mohammad Samragh
K. Nishu
Devang Naik
317
7
0
31 Aug 2023
Cascaded encoders for fine-tuning ASR models on overlapped speech
Interspeech (Interspeech), 2023
R. Rose
Oscar Chang
Olivier Siohan
173
2
0
28 Jun 2023
Master-ASR: Achieving Multilingual Scalability and Low-Resource Adaptation in ASR with Modular Learning
International Conference on Machine Learning (ICML), 2023
Zhongzhi Yu
Yang Zhang
Kaizhi Qian
Y. Fu
Yingyan Lin
300
17
0
23 Jun 2023
Unified model for code-switching speech recognition and language identification based on a concatenated tokenizer
Kunal Dhawan
Dima Rekesh
Boris Ginsburg
296
17
0
14 Jun 2023
Scaling Speech Technology to 1,000+ Languages
Journal of machine learning research (JMLR), 2023
Vineel Pratap
Andros Tjandra
Bowen Shi
Paden Tomasello
Arun Babu
...
Yossi Adi
Xiaohui Zhang
Wei-Ning Hsu
Alexis Conneau
Michael Auli
VLM
524
586
0
22 May 2023
Language-universal phonetic encoder for low-resource speech recognition
Interspeech (Interspeech), 2023
Siyuan Feng
Ming Tu
Rui Xia
Chuanzeng Huang
Yuxuan Wang
265
3
0
19 May 2023
Language-Universal Phonetic Representation in Multilingual Speech Pretraining for Low-Resource Speech Recognition
Interspeech (Interspeech), 2023
Siyuan Feng
Ming Tu
Rui Xia
Chuanzeng Huang
Yuxuan Wang
201
8
0
19 May 2023
End-to-End Speech Recognition: A Survey
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Rohit Prabhavalkar
Takaaki Hori
Tara N. Sainath
Ralf Schlüter
Shinji Watanabe
VLM
361
276
0
03 Mar 2023
Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages
Yu Zhang
Wei Han
James Qin
Yongqiang Wang
Ankur Bapna
...
Pedro J. Moreno
Chung-Cheng Chiu
J. Schalkwyk
Françoise Beaufays
Yonghui Wu
VLM
533
370
0
02 Mar 2023
Improving Massively Multilingual ASR With Auxiliary CTC Objectives
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
William Chen
Brian Yan
Jiatong Shi
Yifan Peng
Soumi Maiti
Shinji Watanabe
348
53
0
24 Feb 2023
UML: A Universal Monolingual Output Layer for Multilingual ASR
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Chaoyang Zhang
Yue Liu
Tara N. Sainath
Trevor Strohman
Shuo-yiin Chang
295
7
0
22 Feb 2023
Efficient Domain Adaptation for Speech Foundation Models
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Yue Liu
Dongseong Hwang
Zhouyuan Huo
Junwen Bai
Guru Prakash
...
K. Sim
Yu Zhang
Wei Han
Trevor Strohman
F. Beaufays
AI4CE
321
30
0
03 Feb 2023
From English to More Languages: Parameter-Efficient Model Reprogramming for Cross-Lingual Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Chao-Han Huck Yang
Yue Liu
Yu Zhang
Nanxin Chen
Rohit Prabhavalkar
Tara N. Sainath
Trevor Strohman
226
33
0
19 Jan 2023
Improved Self-Supervised Multilingual Speech Representation Learning Combined with Auxiliary Language Information
Fenglin Ding
Genshun Wan
Pengcheng Li
Jia Pan
Cong Liu
SSL
319
1
0
07 Dec 2022
Massively Multilingual ASR on 70 Languages: Tokenization, Architecture, and Generalization Capabilities
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Andros Tjandra
Nayan Singhal
David C. Zhang
Ozlem Kalinli
Abdel-rahman Mohamed
Duc Le
M. Seltzer
263
21
0
10 Nov 2022
Towards Zero-Shot Code-Switched Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Brian Yan
Sanjeev Khudanpur
Ondřej Klejch
Preethi Jyothi
Shinji Watanabe
297
24
0
02 Nov 2022
Scaling Up Deliberation for Multilingual ASR
Spoken Language Technology Workshop (SLT), 2022
Ke Hu
Yue Liu
Tara N. Sainath
LRM
344
11
0
11 Oct 2022
Streaming End-to-End Multilingual Speech Recognition with Joint Language Identification
Interspeech (Interspeech), 2022
Chuxu Zhang
Yue Liu
Tara N. Sainath
Trevor Strohman
S. Mavandadi
Shuo-yiin Chang
Parisa Haghani
321
35
0
13 Sep 2022
Learning ASR pathways: A sparse multilingual ASR model
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Mu Yang
Andros Tjandra
Chunxi Liu
David C. Zhang
Duc Le
Ozlem Kalinli
482
14
0
13 Sep 2022
A Language Agnostic Multilingual Streaming On-Device ASR System
Interspeech (Interspeech), 2022
Yue Liu
Tara N. Sainath
Ruoming Pang
Shuo-yiin Chang
Qiumin Xu
...
Qiao Liang
Heguang Liu
Yanzhang He
Parisa Haghani
Sameer Bidichandani
AuLLM
223
14
0
29 Aug 2022
FLEURS: Few-shot Learning Evaluation of Universal Representations of Speech
Spoken Language Technology Workshop (SLT), 2022
Alexis Conneau
Min Ma
Simran Khanuja
Yu Zhang
Vera Axelrod
Siddharth Dalmia
Jason Riesa
Clara E. Rivera
Ankur Bapna
VLM
570
548
0
25 May 2022
Combining Spectral and Self-Supervised Features for Low Resource Speech Recognition and Translation
Interspeech (Interspeech), 2022
Dan Berrebbi
Jiatong Shi
Brian Yan
Osbel López-Francisco
Jonathan D. Amith
Shinji Watanabe
258
32
0
05 Apr 2022
Language Adaptive Cross-lingual Speech Representation Learning with Sparse Sharing Sub-networks
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Yizhou Lu
Mingkun Huang
Xinghua Qu
Pengfei Wei
Zejun Ma
307
21
0
09 Mar 2022
Self-supervised Learning with Random-projection Quantizer for Speech Recognition
International Conference on Machine Learning (ICML), 2022
Chung-Cheng Chiu
James Qin
Yu Zhang
Jiahui Yu
Yonghui Wu
SSL
354
237
0
03 Feb 2022
Improving the fusion of acoustic and text representations in RNN-T
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Chao Zhang
Yue Liu
Zhiyun Lu
Tara N. Sainath
Shuo-yiin Chang
AI4CE
347
14
0
25 Jan 2022
Building a great multi-lingual teacher with sparsely-gated mixture of experts for speech recognition
K. Kumatani
R. Gmyr
Andres Felipe Cruz Salinas
Linquan Liu
Wei Zuo
Devang Patel
Eric Sun
Yu Shi
MoE
369
22
0
10 Dec 2021
XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale
Arun Babu
Changhan Wang
Andros Tjandra
Kushal Lakhotia
Qiantong Xu
...
Yatharth Saraf
J. Pino
Alexei Baevski
Alexis Conneau
Michael Auli
SSL
567
982
0
17 Nov 2021