Multilingual sequence-to-sequence speech recognition: architecture, transfer learning, and language modeling

4 October 2018

Sri Harish Reddy Mallidi

Papers citing "Multilingual sequence-to-sequence speech recognition: architecture, transfer learning, and language modeling"

50 / 59 papers shown

Flavors of Moonshine: Tiny Specialized ASR Models for Edge Devices

02 Sep 2025

Speech Prefix-Tuning with RNNT Loss for Improving LLM Predictions

Bhuvana Ramabhadran

181

20 Jun 2024

GigaSpeech 2: An Evolving, Large-Scale and Multi-domain ASR Corpus for Low-Resource Languages with Automated Crawling, Transcription and Refinement

...

509

17 Jun 2024

Wav2Gloss: Generating Interlinear Glossed Text from Speech

Taiqi He

Kwanghee Choi

Lindia Tjuatja

Nathaniel R. Robinson

Jiatong Shi

Shinji Watanabe

Graham Neubig

David R. Mortensen

Lori S. Levin

210

19 Mar 2024

Automatic Speech Recognition using Advanced Deep Learning Approaches: A survey

Hamza Kheddar

Mustapha Hemis

Yassine Himeur

259

138

02 Mar 2024

A Quantitative Approach to Understand Self-Supervised Models as Cross-lingual Feature Extractors

International Conference on Natural Language and Speech Processing (ICNLSP), 2023

Xiangyu Zhang

155

27 Nov 2023

Multilingual Contextual Adapters To Improve Custom Word Recognition In Low-resource Languages

Interspeech (Interspeech), 2023

217

03 Jul 2023

Cross-lingual Knowledge Transfer and Iterative Pseudo-labeling for Low-Resource Speech Recognition with Transducers

197

23 May 2023

Scaling Speech Technology to 1,000+ Languages

Journal of Machine Learning Research (JMLR), 2023

...

Yossi Adi

389

515

22 May 2023

DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning

Neural Information Processing Systems (NeurIPS), 2023

464

17 May 2023

Deep Transfer Learning for Automatic Speech Recognition: Towards Better Generalization

Knowledge-Based Systems (KBS), 2023

Hamza Kheddar

297

117

27 Apr 2023

Deep representation learning: Fundamentals, Perspectives, Applications, and Open Challenges

Somayeh Bakhtiari Ramezani

215

27 Nov 2022

Maestro-U: Leveraging joint speech-text representation learning for zero supervised speech ASR

Spoken Language Technology Workshop (SLT), 2022

Zhehuai Chen

Andrew Rosenberg

Bhuvana Ramabhadran

237

18 Oct 2022

Streaming End-to-End Multilingual Speech Recognition with Joint Language Identification

Interspeech (Interspeech), 2022

281

13 Sep 2022

End-to-End Spoken Language Understanding: Performance analyses of a voice command task in a low resource setting

Computer Speech and Language (CSL), 2022

Thierry Desot

François Portet

Michel Vacher

103

17 Jul 2022

Non-Linear Pairwise Language Mappings for Low-Resource Multilingual Acoustic Model Fusion

Interspeech (Interspeech), 2022

Muhammad Umar Farooq

Darshan Adiga Haniya Narayana

Thomas Hain

121

07 Jul 2022

Combining Spectral and Self-Supervised Features for Low Resource Speech Recognition and Translation

Interspeech (Interspeech), 2022

Dan Berrebbi

Jiatong Shi

Brian Yan

Osbel López-Francisco

Jonathan D. Amith

Shinji Watanabe

206

05 Apr 2022

Curriculum optimization for low-resource speech recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

130

17 Feb 2022

Cascaded Multilingual Audio-Visual Learning from Videos

...

529

08 Nov 2021

Recent Advances in End-to-End Automatic Speech Recognition

APSIPA Transactions on Signal and Information Processing (TASIP), 2021

Jinyu Li

424

425

02 Nov 2021

Pseudo-Labeling for Massively Multilingual Speech Recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021

297

30 Oct 2021

Wav-BERT: Cooperative Acoustic and Linguistic Representation Learning for Low-Resource Speech Recognition

Xiaodan Liang

194

19 Sep 2021

Coarse-To-Fine And Cross-Lingual ASR Transfer

Peter Polák

Ondrej Bojar

02 Sep 2021

A Study of Multilingual End-to-End Speech Recognition for Kazakh, Russian, and English

Saida Mussakhojayeva

Yerbolat Khassanov

H. A. Varol

153

03 Aug 2021

Improved Language Identification Through Cross-Lingual Self-Supervised Learning

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021

Andros Tjandra

Diptanu Gon Choudhury

182

08 Jul 2021

Overcoming Domain Mismatch in Low Resource Sequence-to-Sequence ASR Models using Hybrid Generated Pseudotranscripts

157

14 Jun 2021

PARP: Prune, Adjust and Re-Prune for Self-Supervised Speech Recognition

Neural Information Processing Systems (NeurIPS), 2021

Kaizhi Qian

295

10 Jun 2021

Towards One Model to Rule All: Multilingual Strategy for Dialectal Code-Switching Arabic ASR

Interspeech (Interspeech), 2021

265

31 May 2021

Exploiting Adapters for Cross-lingual Low-resource Speech Recognition

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2021

234

18 May 2021

XLST: Cross-lingual Self-training to Learn Multilingual Representation for Low Resource Speech Recognition

Yan Song

116

15 Mar 2021

Dynamic Acoustic Unit Augmentation With BPE-Dropout for Low-Resource End-to-End Speech Recognition

Italian National Conference on Sensors (INS), 2021

Ivan Medennikov

129

12 Mar 2021

End-to-end acoustic modelling for phone recognition of young readers

Speech Communication (Speech Commun.), 2021

184

04 Mar 2021

Train your classifier first: Cascade Neural Networks Training from upper layers to lower layers

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021

Shucong Zhang

236

09 Feb 2021

Two-Stage Augmentation and Adaptive CTC Fusion for Improved Robustness of Multi-Stream End-to-End ASR

Spoken Language Technology Workshop (SLT), 2021

Ruizhi Li

Gregory Sell

H. Hermansky

140

05 Feb 2021

Adversarial Meta Sampling for Multilingual Low-Resource Speech Recognition

AAAI Conference on Artificial Intelligence (AAAI), 2020

Xiaodan Liang

201

22 Dec 2020

Transformer-Transducers for Code-Switched Speech Recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020

222

30 Nov 2020

Bootstrap an end-to-end ASR system by multilingual training, transfer learning, text-to-text mapping and synthetic audio

Interspeech (Interspeech), 2020

160

25 Nov 2020

Autosegmental Neural Nets: Should Phones and Tones be Synchronous or Asynchronous?

Interspeech (Interspeech), 2020

Jialu Li

M. Hasegawa-Johnson

173

28 Jul 2020

Massively Multilingual ASR: 50 Languages, 1 Model, 1 Billion Parameters

254

153

06 Jul 2020

Unsupervised Cross-lingual Representation Learning for Speech Recognition

Interspeech (Interspeech), 2020

362

919

24 Jun 2020

Improving Cross-Lingual Transfer Learning for End-to-End Speech Recognition with Speech Translation

Changhan Wang

J. Pino

Jiatao Gu

148

09 Jun 2020

Fusion Recurrent Neural Network

07 Jun 2020

Improved acoustic word embeddings for zero-resource languages using multilingual transfer

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2020

Herman Kamper

Yevgen Matusevych

Sharon Goldwater

242

02 Jun 2020

An End-to-End Mispronunciation Detection System for L2 English Speech Leveraging Novel Anti-Phone Modeling

Interspeech (Interspeech), 2020

Bi-Cheng Yan

Meng-Che Wu

Hsiao-Tsung Hung

Berlin Chen

138

25 May 2020

DARTS-ASR: Differentiable Architecture Search for Multilingual Speech Recognition and Adaptation

197

13 May 2020

Speech Corpus of Ainu Folklore and End-to-end Speech Recognition for Ainu Language

International Conference on Language Resources and Evaluation (LREC), 2020

175

16 Feb 2020

Multilingual acoustic word embedding models for processing zero-resource languages

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020

Herman Kamper

Yevgen Matusevych

Sharon Goldwater

272

06 Feb 2020

Meta Learning for End-to-End Low-Resource Speech Recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019

Jui-Yang Hsu

Yuan-Jui Chen

Hung-yi Lee

116

114

26 Oct 2019

Analyzing ASR pretraining for low-resource speech-to-text translation

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019

Mihaela C. Stoian

Sameer Bansal

Sharon Goldwater

227

23 Oct 2019

A practical two-stage training strategy for multi-stream end-to-end speech recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019

23 Oct 2019