Scaling End-to-End Models for Large-Scale Multilingual ASR

Automatic Speech Recognition & Understanding (ASRU), 2021

30 April 2021

Papers citing "Scaling End-to-End Models for Large-Scale Multilingual ASR"

50 / 61 papers shown

MLMA: Towards Multilingual ASR With Mamba-based Architectures

322

21 Oct 2025

Long Chain-of-Thought Reasoning Across Languages

248

20 Aug 2025

GigaAM: Efficient Self-Supervised Learner for Speech Recognition

292

01 Jun 2025

Whisper-LM: Improving ASR Models with Language Models for Low-Resource Languages

382

30 Mar 2025

LUPET: Incorporating Hierarchical Information Path into Multilingual ASR

Interspeech (Interspeech), 2024

620

10 Jan 2025

A two-stage transliteration approach to improve performance of a multilingual ASR

Rohit Kumar

218

09 Oct 2024

Exploring SSL Discrete Tokens for Multilingual ASR

Mingyu Cui

Daxin Tan

Yifan Yang

Dingdong Wang

Huimeng Wang

Xiao Chen

Xie Chen

Xunying Liu

347

13 Sep 2024

Learn and Don't Forget: Adding a New Language to ASR Foundation Models

366

09 Jul 2024

GigaSpeech 2: An Evolving, Large-Scale and Multi-domain ASR Corpus for Low-Resource Languages with Automated Crawling, Transcription and Refinement

...

649

17 Jun 2024

Dual-Pipeline with Low-Rank Adaptation for New Language Integration in Multilingual ASR

Yerbolat Khassanov

Zhipeng Chen

Tianfeng Chen

Tze Yuang Chong

Wei Li

Jun Zhang

Lu Lu

Yuxuan Wang

AI4CE

267

12 Jun 2024

A Parameter-efficient Language Extension Framework for Multilingual ASR

Wei Liu

Tan Lee

333

10 Jun 2024

USM RNN-T model weights binarization

Oleg Rybakov

Dmitriy Serdyuk

Chengjian Zheng

360

05 Jun 2024

Whistle: Data-Efficient Multilingual and Crosslingual Speech Recognition via Weakly Phonetic Supervision

424

04 Jun 2024

Exploring neural oscillations during speech perception via surrogate gradient spiking neural networks

Alexandre Bittar

Philip N. Garner

199

22 Apr 2024

Multi-modal Deep Learning

Chen Yuhua

MedIm

430

06 Mar 2024

Extreme Encoder Output Frame Rate Reduction: Improving Computational Latencies of Large End-to-End Models

249

27 Feb 2024

Multilingual and Fully Non-Autoregressive ASR with Large Language Model Fusion: A Comprehensive Study

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

307

23 Jan 2024

Efficient Adapter Finetuning for Tail Languages in Streaming Multilingual ASR

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

381

17 Jan 2024

Findings of the 2023 ML-SUPERB Challenge: Pre-Training and Evaluation over More Languages and Beyond

Automatic Speech Recognition & Understanding (ASRU), 2023

Jiatong Shi

...

Yuxun Tang

Shang-Wen Li

Abdelrahman Mohamed

Hung-yi Lee

Shinji Watanabe

LRM ELM

458

09 Oct 2023

UniverSLU: Universal Spoken Language Understanding for Diverse Tasks with Natural Language Instructions

North American Chapter of the Association for Computational Linguistics (NAACL), 2023

Shinji Watanabe

295

04 Oct 2023

SSHR: Leveraging Self-supervised Hierarchical Representations for Multilingual Automatic Speech Recognition

IEEE International Conference on Multimedia and Expo (ICME), 2023

Hongfei Xue

Qijie Shao

Tommy Yuan

Peikun Chen

Jie Liu

Lei Xie

311

29 Sep 2023

Generative Speech Recognition Error Correction with Large Language Models and Task-Activating Prompting

Automatic Speech Recognition & Understanding (ASRU), 2023

491

27 Sep 2023

Reproducing Whisper-Style Training Using an Open-Source Toolkit and Publicly Available Data

Automatic Speech Recognition & Understanding (ASRU), 2023

...

410

25 Sep 2023

Improving vision-inspired keyword spotting using dynamic module skipping in streaming conformer encoder

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

317

31 Aug 2023

Cascaded encoders for fine-tuning ASR models on overlapped speech

Interspeech (Interspeech), 2023

R. Rose

Oscar Chang

Olivier Siohan

173

28 Jun 2023

Master-ASR: Achieving Multilingual Scalability and Low-Resource Adaptation in ASR with Modular Learning

International Conference on Machine Learning (ICML), 2023

Kaizhi Qian

300

23 Jun 2023

Unified model for code-switching speech recognition and language identification based on a concatenated tokenizer

Kunal Dhawan

Dima Rekesh

Boris Ginsburg

296

14 Jun 2023

Scaling Speech Technology to 1,000+ Languages

Journal of Machine Learning Research (JMLR), 2023

...

Yossi Adi

524

586

22 May 2023

Language-universal phonetic encoder for low-resource speech recognition

Interspeech (Interspeech), 2023

Yuxuan Wang

265

19 May 2023

Language-Universal Phonetic Representation in Multilingual Speech Pretraining for Low-Resource Speech Recognition

Interspeech (Interspeech), 2023

Yuxuan Wang

201

19 May 2023

End-to-End Speech Recognition: A Survey

IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), 2023

361

276

03 Mar 2023

Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages

...

533

370

02 Mar 2023

Improving Massively Multilingual ASR With Auxiliary CTC Objectives

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Jiatong Shi

348

24 Feb 2023

UML: A Universal Monolingual Output Layer for Multilingual ASR

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

295

22 Feb 2023

Efficient Domain Adaptation for Speech Foundation Models

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

...

321

03 Feb 2023

From English to More Languages: Parameter-Efficient Model Reprogramming for Cross-Lingual Speech Recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

226

19 Jan 2023

Improved Self-Supervised Multilingual Speech Representation Learning Combined with Auxiliary Language Information

319

07 Dec 2022

Massively Multilingual ASR on 70 Languages: Tokenization, Architecture, and Generalization Capabilities

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Ozlem Kalinli

263

10 Nov 2022

Towards Zero-Shot Code-Switched Speech Recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

297

02 Nov 2022

Scaling Up Deliberation for Multilingual ASR

Spoken Language Technology Workshop (SLT), 2022

344

11 Oct 2022

Streaming End-to-End Multilingual Speech Recognition with Joint Language Identification

Interspeech (Interspeech), 2022

321

13 Sep 2022

Learning ASR pathways: A sparse multilingual ASR model

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Ozlem Kalinli

482

13 Sep 2022

A Language Agnostic Multilingual Streaming On-Device ASR System

Interspeech (Interspeech), 2022

...

223

29 Aug 2022

FLEURS: Few-shot Learning Evaluation of Universal Representations of Speech

Spoken Language Technology Workshop (SLT), 2022

570

548

25 May 2022

Combining Spectral and Self-Supervised Features for Low Resource Speech Recognition and Translation

Interspeech (Interspeech), 2022

Dan Berrebbi

Jiatong Shi

Brian Yan

Osbel López-Francisco

Jonathan D. Amith

Shinji Watanabe

258

05 Apr 2022

Language Adaptive Cross-lingual Speech Representation Learning with Sparse Sharing Sub-networks

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

307

09 Mar 2022

Self-supervised Learning with Random-projection Quantizer for Speech Recognition

International Conference on Machine Learning (ICML), 2022

354

237

03 Feb 2022

Improving the fusion of acoustic and text representations in RNN-T

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Chao Zhang

347

25 Jan 2022

Building a great multi-lingual teacher with sparsely-gated mixture of experts for speech recognition

K. Kumatani

R. Gmyr

Andres Felipe Cruz Salinas

369

10 Dec 2021

XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale

...

567

982

17 Nov 2021