Scaling End-to-End Models for Large-Scale Multilingual ASR

Automatic Speech Recognition & Understanding (ASRU), 2021
30 April 2021
Yue Liu
Ruoming Pang
Tara N. Sainath
Anmol Gulati
Yu Zhang
James Qin
Parisa Haghani
Wenjie Huang
Min Ma
Junwen Bai
    CLL

Papers citing "Scaling End-to-End Models for Large-Scale Multilingual ASR"

50 / 61 papers shown
MLMA: Towards Multilingual ASR With Mamba-based Architectures
Mohamed Nabih Ali
Daniele Falavigna
Alessio Brutti
Mamba
322
0
0
21 Oct 2025
Long Chain-of-Thought Reasoning Across Languages
Josh Barua
Seun Eisape
Kayo Yin
Alane Suhr
LRM
ELM
248
6
0
20 Aug 2025
GigaAM: Efficient Self-Supervised Learner for Speech Recognition
Aleksandr Kutsakov
Alexandr Maximenko
Georgii Gospodinov
Pavel Bogomolov
Fyodor Minkin
292
2
0
01 Jun 2025
Whisper-LM: Improving ASR Models with Language Models for Low-Resource Languages
Xabier de Zuazo
Eva Navas
Ibon Saratxaga
Inma Hernáez Rioja
382
5
0
30 Mar 2025
LUPET: Incorporating Hierarchical Information Path into Multilingual ASR
Interspeech (Interspeech), 2024
Wei Liu
Jingyong Hou
Dong Yang
Muyong Cao
Tan Lee
620
2
0
10 Jan 2025
A two-stage transliteration approach to improve performance of a multilingual ASR
Rohit Kumar
218
0
0
09 Oct 2024
Exploring SSL Discrete Tokens for Multilingual ASR
Mingyu Cui
Daxin Tan
Yifan Yang
Dingdong Wang
Huimeng Wang
Xiao Chen
Xie Chen
Xunying Liu
347
6
0
13 Sep 2024
Learn and Don't Forget: Adding a New Language to ASR Foundation Models
Mengjie Qian
Siyuan Tang
Rao Ma
Kate Knill
Mark Gales
CLL
366
21
0
09 Jul 2024
GigaSpeech 2: An Evolving, Large-Scale and Multi-domain ASR Corpus for Low-Resource Languages with Automated Crawling, Transcription and Refinement
Yifan Yang
Zheshu Song
Jianheng Zhuo
Mingyu Cui
Jinpeng Li
...
Shuai Fan
Kai Yu
Wei Zhang
Guoguo Chen
Xie Chen
649
42
0
17 Jun 2024
Dual-Pipeline with Low-Rank Adaptation for New Language Integration in Multilingual ASR
Yerbolat Khassanov
Zhipeng Chen
Tianfeng Chen
Tze Yuang Chong
Wei Li
Jun Zhang
Lu Lu
Yuxuan Wang
AI4CE
267
3
0
12 Jun 2024
A Parameter-efficient Language Extension Framework for Multilingual ASR
Wei Liu
Jingyong Hou
Dong Yang
Muyong Cao
Tan Lee
CLL
333
5
0
10 Jun 2024
USM RNN-T model weights binarization
Oleg Rybakov
Dmitriy Serdyuk
Chengjian Zheng
MQ
360
2
0
05 Jun 2024
Whistle: Data-Efficient Multilingual and Crosslingual Speech Recognition via Weakly Phonetic Supervision
Saierdaer Yusuyin
Te Ma
Hao Huang
Wenbo Zhao
Zhijian Ou
424
14
0
04 Jun 2024
Exploring neural oscillations during speech perception via surrogate gradient spiking neural networks
Alexandre Bittar
Philip N. Garner
199
3
0
22 Apr 2024
Multi-modal Deep Learning
Chen Yuhua
MedIm
430
54
0
06 Mar 2024
Extreme Encoder Output Frame Rate Reduction: Improving Computational Latencies of Large End-to-End Models
Rohit Prabhavalkar
Zhong Meng
Weiran Wang
Adam Stooke
Xingyu Cai
Yanzhang He
Arun Narayanan
Dongseong Hwang
Tara N. Sainath
Pedro J. Moreno
249
11
0
27 Feb 2024
Multilingual and Fully Non-Autoregressive ASR with Large Language Model Fusion: A Comprehensive Study
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Wenjie Huang
Cyril Allauzen
Tongzhou Chen
Kilol Gupta
Ke Hu
James Qin
Yu Zhang
Yongqiang Wang
Shuo-yiin Chang
Tara N. Sainath
MoMe
307
19
0
23 Jan 2024
Efficient Adapter Finetuning for Tail Languages in Streaming Multilingual ASR
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Junwen Bai
Yue Liu
Qiujia Li
Tara N. Sainath
Trevor Strohman
381
8
0
17 Jan 2024
Findings of the 2023 ML-SUPERB Challenge: Pre-Training and Evaluation over More Languages and Beyond
Automatic Speech Recognition & Understanding (ASRU), 2023
Jiatong Shi
William Chen
Dan Berrebbi
Hsiu-Hsuan Wang
Wei-Ping Huang
...
Yuxun Tang
Shang-Wen Li
Abdelrahman Mohamed
Hung-yi Lee
Shinji Watanabe
LRM
ELM
458
18
0
09 Oct 2023
UniverSLU: Universal Spoken Language Understanding for Diverse Tasks with Natural Language Instructions
North American Chapter of the Association for Computational Linguistics (NAACL), 2023
Siddhant Arora
Hayato Futami
Jee-weon Jung
Yifan Peng
Roshan S. Sharma
Yosuke Kashiwagi
E. Tsunoo
Karen Livescu
Shinji Watanabe
ELM
295
12
0
04 Oct 2023
SSHR: Leveraging Self-supervised Hierarchical Representations for Multilingual Automatic Speech Recognition
IEEE International Conference on Multimedia and Expo (ICME), 2023
Hongfei Xue
Qijie Shao
Tommy Yuan
Peikun Chen
Jie Liu
Lei Xie
311
6
0
29 Sep 2023
Generative Speech Recognition Error Correction with Large Language Models and Task-Activating Prompting
Automatic Speech Recognition & Understanding (ASRU), 2023
Chao-Han Huck Yang
Yile Gu
Yi-Chieh Liu
Shalini Ghosh
I. Bulyko
A. Stolcke
KELM
LRM
491
91
0
27 Sep 2023
Reproducing Whisper-Style Training Using an Open-Source Toolkit and Publicly Available Data
Automatic Speech Recognition & Understanding (ASRU), 2023
Yifan Peng
Jinchuan Tian
Brian Yan
Dan Berrebbi
Xuankai Chang
...
Yui Sudo
Muhammad Shakeel
Jee-weon Jung
Soumi Maiti
Shinji Watanabe
VLM
410
72
0
25 Sep 2023
Improving vision-inspired keyword spotting using dynamic module skipping in streaming conformer encoder
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Alexandre Bittar
Paul Dixon
Mohammad Samragh
K. Nishu
Devang Naik
317
7
0
31 Aug 2023
Cascaded encoders for fine-tuning ASR models on overlapped speech
Interspeech (Interspeech), 2023
R. Rose
Oscar Chang
Olivier Siohan
173
2
0
28 Jun 2023
Master-ASR: Achieving Multilingual Scalability and Low-Resource Adaptation in ASR with Modular Learning
International Conference on Machine Learning (ICML), 2023
Zhongzhi Yu
Yang Zhang
Kaizhi Qian
Y. Fu
Yingyan Lin
300
17
0
23 Jun 2023
Unified model for code-switching speech recognition and language identification based on a concatenated tokenizer
Kunal Dhawan
Dima Rekesh
Boris Ginsburg
296
17
0
14 Jun 2023
Scaling Speech Technology to 1,000+ Languages
Journal of Machine Learning Research (JMLR), 2023
Vineel Pratap
Andros Tjandra
Bowen Shi
Paden Tomasello
Arun Babu
...
Yossi Adi
Xiaohui Zhang
Wei-Ning Hsu
Alexis Conneau
Michael Auli
VLM
524
586
0
22 May 2023
Language-universal phonetic encoder for low-resource speech recognition
Interspeech (Interspeech), 2023
Siyuan Feng
Ming Tu
Rui Xia
Chuanzeng Huang
Yuxuan Wang
265
3
0
19 May 2023
Language-Universal Phonetic Representation in Multilingual Speech Pretraining for Low-Resource Speech Recognition
Interspeech (Interspeech), 2023
Siyuan Feng
Ming Tu
Rui Xia
Chuanzeng Huang
Yuxuan Wang
201
8
0
19 May 2023
End-to-End Speech Recognition: A Survey
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Rohit Prabhavalkar
Takaaki Hori
Tara N. Sainath
Ralf Schlüter
Shinji Watanabe
VLM
361
276
0
03 Mar 2023
Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages
Yu Zhang
Wei Han
James Qin
Yongqiang Wang
Ankur Bapna
...
Pedro J. Moreno
Chung-Cheng Chiu
J. Schalkwyk
Françoise Beaufays
Yonghui Wu
VLM
533
370
0
02 Mar 2023
Improving Massively Multilingual ASR With Auxiliary CTC Objectives
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
William Chen
Brian Yan
Jiatong Shi
Yifan Peng
Soumi Maiti
Shinji Watanabe
348
53
0
24 Feb 2023
UML: A Universal Monolingual Output Layer for Multilingual ASR
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Chaoyang Zhang
Yue Liu
Tara N. Sainath
Trevor Strohman
Shuo-yiin Chang
295
7
0
22 Feb 2023
Efficient Domain Adaptation for Speech Foundation Models
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Yue Liu
DongSeon Hwang
Zhouyuan Huo
Junwen Bai
Guru Prakash
...
K. Sim
Yu Zhang
Wei Han
Trevor Strohman
F. Beaufays
AI4CE
321
30
0
03 Feb 2023
From English to More Languages: Parameter-Efficient Model Reprogramming for Cross-Lingual Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Chao-Han Huck Yang
Yue Liu
Yu Zhang
Nanxin Chen
Rohit Prabhavalkar
Tara N. Sainath
Trevor Strohman
226
33
0
19 Jan 2023
Improved Self-Supervised Multilingual Speech Representation Learning Combined with Auxiliary Language Information
Fenglin Ding
Genshun Wan
Pengcheng Li
Jia Pan
Cong Liu
SSL
319
1
0
07 Dec 2022
Massively Multilingual ASR on 70 Languages: Tokenization, Architecture, and Generalization Capabilities
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Andros Tjandra
Nayan Singhal
David C. Zhang
Ozlem Kalinli
Abdel-rahman Mohamed
Duc Le
M. Seltzer
263
21
0
10 Nov 2022
Towards Zero-Shot Code-Switched Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Brian Yan
Sanjeev Khudanpur
Ondřej Klejch
Preethi Jyothi
Shinji Watanabe
297
24
0
02 Nov 2022
Scaling Up Deliberation for Multilingual ASR
Spoken Language Technology Workshop (SLT), 2022
Ke Hu
Yue Liu
Tara N. Sainath
LRM
344
11
0
11 Oct 2022
Streaming End-to-End Multilingual Speech Recognition with Joint Language Identification
Interspeech (Interspeech), 2022
Chuxu Zhang
Yue Liu
Tara N. Sainath
Trevor Strohman
S. Mavandadi
Shuo-yiin Chang
Parisa Haghani
321
35
0
13 Sep 2022
Learning ASR pathways: A sparse multilingual ASR model
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Mu Yang
Andros Tjandra
Chunxi Liu
David C. Zhang
Duc Le
Ozlem Kalinli
482
14
0
13 Sep 2022
A Language Agnostic Multilingual Streaming On-Device ASR System
Interspeech (Interspeech), 2022
Yue Liu
Tara N. Sainath
Ruoming Pang
Shuo-yiin Chang
Qiumin Xu
...
Qiao Liang
Heguang Liu
Yanzhang He
Parisa Haghani
Sameer Bidichandani
AuLLM
223
14
0
29 Aug 2022
FLEURS: Few-shot Learning Evaluation of Universal Representations of Speech
Spoken Language Technology Workshop (SLT), 2022
Alexis Conneau
Min Ma
Simran Khanuja
Yu Zhang
Vera Axelrod
Siddharth Dalmia
Jason Riesa
Clara E. Rivera
Ankur Bapna
VLM
570
548
0
25 May 2022
Combining Spectral and Self-Supervised Features for Low Resource Speech Recognition and Translation
Interspeech (Interspeech), 2022
Dan Berrebbi
Jiatong Shi
Brian Yan
Osbel López-Francisco
Jonathan D. Amith
Shinji Watanabe
258
32
0
05 Apr 2022
Language Adaptive Cross-lingual Speech Representation Learning with Sparse Sharing Sub-networks
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Yizhou Lu
Mingkun Huang
Xinghua Qu
Pengfei Wei
Zejun Ma
307
21
0
09 Mar 2022
Self-supervised Learning with Random-projection Quantizer for Speech Recognition
International Conference on Machine Learning (ICML), 2022
Chung-Cheng Chiu
James Qin
Yu Zhang
Jiahui Yu
Yonghui Wu
SSL
354
237
0
03 Feb 2022
Improving the fusion of acoustic and text representations in RNN-T
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Chao Zhang
Yue Liu
Zhiyun Lu
Tara N. Sainath
Shuo-yiin Chang
AI4CE
347
14
0
25 Jan 2022
Building a great multi-lingual teacher with sparsely-gated mixture of experts for speech recognition
K. Kumatani
R. Gmyr
Andres Felipe Cruz Salinas
Linquan Liu
Wei Zuo
Devang Patel
Eric Sun
Yu Shi
MoE
369
22
0
10 Dec 2021
XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale
Arun Babu
Changhan Wang
Andros Tjandra
Kushal Lakhotia
Qiantong Xu
...
Yatharth Saraf
J. Pino
Alexei Baevski
Alexis Conneau
Michael Auli
SSL
567
982
0
17 Nov 2021