1810.03459
Cited By
Multilingual sequence-to-sequence speech recognition: architecture, transfer learning, and language modeling
4 October 2018
Jaejin Cho
M. Baskar
Ruizhi Li
Sanjeev Khudanpur
Sri Harish Reddy Mallidi
Nelson Yalta
M. Karafiát
Shinji Watanabe
Takaaki Hori
Papers citing
"Multilingual sequence-to-sequence speech recognition: architecture, transfer learning, and language modeling"
50 / 59 papers shown
Flavors of Moonshine: Tiny Specialized ASR Models for Edge Devices
Evan King
Adam Sabra
M. Kudlur
James Wang
Pete Warden
82
0
0
02 Sep 2025
Speech Prefix-Tuning with RNNT Loss for Improving LLM Predictions
M. Baskar
Andrew Rosenberg
Bhuvana Ramabhadran
Neeraj Gaur
Zhong Meng
181
3
0
20 Jun 2024
GigaSpeech 2: An Evolving, Large-Scale and Multi-domain ASR Corpus for Low-Resource Languages with Automated Crawling, Transcription and Refinement
Yifan Yang
Zheshu Song
Jianheng Zhuo
Mingyu Cui
Jinpeng Li
...
Shuai Fan
Kai Yu
Wei Zhang
Guoguo Chen
Xie Chen
509
34
0
17 Jun 2024
Wav2Gloss: Generating Interlinear Glossed Text from Speech
Taiqi He
Kwanghee Choi
Lindia Tjuatja
Nathaniel R. Robinson
Jiatong Shi
Shinji Watanabe
Graham Neubig
David R. Mortensen
Lori S. Levin
VLM
210
5
0
19 Mar 2024
Automatic Speech Recognition using Advanced Deep Learning Approaches: A survey
Hamza Kheddar
Mustapha Hemis
Yassine Himeur
OffRL
259
138
0
02 Mar 2024
A Quantitative Approach to Understand Self-Supervised Models as Cross-lingual Feature Extractors
International Conference on Natural Language and Speech Processing (ICNLSP), 2023
Shuyue Stella Li
Beining Xu
Xiangyu Zhang
Hexin Liu
Wen-Han Chao
Leibny Paola García
SSL
155
4
0
27 Nov 2023
Multilingual Contextual Adapters To Improve Custom Word Recognition In Low-resource Languages
Interspeech (Interspeech), 2023
Devang Kulshreshtha
Saket Dingliwal
Brady C. Houston
S. Bodapati
217
6
0
03 Jul 2023
Cross-lingual Knowledge Transfer and Iterative Pseudo-labeling for Low-Resource Speech Recognition with Transducers
J. Silovský
Liuhui Deng
Arturo Argueta
Tresi Arvizo
Roger Hsiao
Sasha Kuznietsov
Yiu-Chang Lin
Xiaoqiang Xiao
Yuanyuan Zhang
197
3
0
23 May 2023
Scaling Speech Technology to 1,000+ Languages
Journal of Machine Learning Research (JMLR), 2023
Vineel Pratap
Andros Tjandra
Bowen Shi
Paden Tomasello
Arun Babu
...
Yossi Adi
Xiaohui Zhang
Wei-Ning Hsu
Alexis Conneau
Michael Auli
VLM
389
515
0
22 May 2023
DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning
Neural Information Processing Systems (NeurIPS), 2023
Alexander H. Liu
Heng-Jui Chang
Michael Auli
Wei-Ning Hsu
James R. Glass
464
36
0
17 May 2023
Deep Transfer Learning for Automatic Speech Recognition: Towards Better Generalization
Knowledge-Based Systems (KBS), 2023
Hamza Kheddar
Yassine Himeur
S. Al-Maadeed
Abbes Amira
F. Bensaali
297
117
0
27 Apr 2023
Deep representation learning: Fundamentals, Perspectives, Applications, and Open Challenges
K. T. Baghaei
Amirreza Payandeh
Pooya Fayyazsanavi
Shahram Rahimi
Zhiqian Chen
Somayeh Bakhtiari Ramezani
FaML
AI4TS
215
10
0
27 Nov 2022
Maestro-U: Leveraging joint speech-text representation learning for zero supervised speech ASR
Spoken Language Technology Workshop (SLT), 2022
Zhehuai Chen
Ankur Bapna
Andrew Rosenberg
Yu Zhang
Bhuvana Ramabhadran
Pedro J. Moreno
Nanxin Chen
237
17
0
18 Oct 2022
Streaming End-to-End Multilingual Speech Recognition with Joint Language Identification
Interspeech (Interspeech), 2022
Chuxu Zhang
Yue Liu
Tara N. Sainath
Trevor Strohman
S. Mavandadi
Shuo-yiin Chang
Parisa Haghani
281
34
0
13 Sep 2022
End-to-End Spoken Language Understanding: Performance analyses of a voice command task in a low resource setting
Computer Speech and Language (CSL), 2022
Thierry Desot
François Portet
Michel Vacher
103
15
0
17 Jul 2022
Non-Linear Pairwise Language Mappings for Low-Resource Multilingual Acoustic Model Fusion
Interspeech (Interspeech), 2022
Muhammad Umar Farooq
Darshan Adiga Haniya Narayana
Thomas Hain
121
2
0
07 Jul 2022
Combining Spectral and Self-Supervised Features for Low Resource Speech Recognition and Translation
Interspeech (Interspeech), 2022
Dan Berrebbi
Jiatong Shi
Brian Yan
Osbel López-Francisco
Jonathan D. Amith
Shinji Watanabe
206
30
0
05 Apr 2022
Curriculum optimization for low-resource speech recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Anastasia Kuznetsova
Anurag Kumar
Jennifer Drexler Fox
Francis M. Tyers
130
3
0
17 Feb 2022
Cascaded Multilingual Audio-Visual Learning from Videos
Andrew Rouditchenko
Angie Boggust
David Harwath
Samuel Thomas
Hilde Kuehne
...
Yikang Shen
Rogerio Feris
Brian Kingsbury
M. Picheny
James R. Glass
529
8
0
08 Nov 2021
Recent Advances in End-to-End Automatic Speech Recognition
APSIPA Transactions on Signal and Information Processing (TASIP), 2021
Jinyu Li
VLM
424
425
0
02 Nov 2021
Pseudo-Labeling for Massively Multilingual Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Loren Lugosch
Tatiana Likhomanenko
Gabriel Synnaeve
R. Collobert
VLM
297
34
0
30 Oct 2021
Wav-BERT: Cooperative Acoustic and Linguistic Representation Learning for Low-Resource Speech Recognition
Guolin Zheng
Yubei Xiao
Ke Gong
Pan Zhou
Xiaodan Liang
Liang Lin
194
27
0
19 Sep 2021
Coarse-To-Fine And Cross-Lingual ASR Transfer
Peter Polák
Ondrej Bojar
91
3
0
02 Sep 2021
A Study of Multilingual End-to-End Speech Recognition for Kazakh, Russian, and English
Saida Mussakhojayeva
Yerbolat Khassanov
H. A. Varol
153
19
0
03 Aug 2021
Improved Language Identification Through Cross-Lingual Self-Supervised Learning
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Andros Tjandra
Diptanu Gon Choudhury
Frank Zhang
Kritika Singh
Alexis Conneau
Alexei Baevski
Assaf Sela
Yatharth Saraf
Michael Auli
VLM
SSL
182
37
0
08 Jul 2021
Overcoming Domain Mismatch in Low Resource Sequence-to-Sequence ASR Models using Hybrid Generated Pseudotranscripts
Chak-Fai Li
Francis Keith
William Hartmann
M. Snover
O. Kimball
157
4
0
14 Jun 2021
PARP: Prune, Adjust and Re-Prune for Self-Supervised Speech Recognition
Neural Information Processing Systems (NeurIPS), 2021
Cheng-I Jeff Lai
Yang Zhang
Alexander H. Liu
Shiyu Chang
Yi-Lun Liao
Yung-Sung Chuang
Kaizhi Qian
Sameer Khurana
David D. Cox
James R. Glass
VLM
295
86
0
10 Jun 2021
Towards One Model to Rule All: Multilingual Strategy for Dialectal Code-Switching Arabic ASR
Interspeech (Interspeech), 2021
Shammur A. Chowdhury
A. Hussein
Ahmed Abdelali
Ahmed M. Ali
265
48
0
31 May 2021
Exploiting Adapters for Cross-lingual Low-resource Speech Recognition
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2021
Wenxin Hou
Hanlin Zhu
Yidong Wang
Yongfeng Zhang
Tao Qin
Renjun Xu
T. Shinozaki
234
73
0
18 May 2021
XLST: Cross-lingual Self-training to Learn Multilingual Representation for Low Resource Speech Recognition
Zi-qiang Zhang
Yan Song
Ming Wu
Xin Fang
Lirong Dai
SSL
116
22
0
15 Mar 2021
Dynamic Acoustic Unit Augmentation With BPE-Dropout for Low-Resource End-to-End Speech Recognition
Sensors, 2021
A. Laptev
A. Andrusenko
Ivan Podluzhny
Anton Mitrofanov
Ivan Medennikov
Yuri N. Matveev
VLM
129
15
0
12 Mar 2021
End-to-end acoustic modelling for phone recognition of young readers
Speech Communication (Speech Commun.), 2021
Lucile Gelin
Morgane Daniel
J. Pinquier
Thomas Pellegrini
184
17
0
04 Mar 2021
Train your classifier first: Cascade Neural Networks Training from upper layers to lower layers
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Shucong Zhang
Cong-Thanh Do
R. Doddipatla
Erfan Loweimi
P. Bell
Steve Renals
236
2
0
09 Feb 2021
Two-Stage Augmentation and Adaptive CTC Fusion for Improved Robustness of Multi-Stream End-to-End ASR
Spoken Language Technology Workshop (SLT), 2021
Ruizhi Li
Gregory Sell
H. Hermansky
140
2
0
05 Feb 2021
Adversarial Meta Sampling for Multilingual Low-Resource Speech Recognition
AAAI Conference on Artificial Intelligence (AAAI), 2020
Yubei Xiao
Ke Gong
Pan Zhou
Guolin Zheng
Xiaodan Liang
Liang Lin
201
35
0
22 Dec 2020
Transformer-Transducers for Code-Switched Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
Siddharth Dalmia
Yuzong Liu
S. Ronanki
Katrin Kirchhoff
222
50
0
30 Nov 2020
Bootstrap an end-to-end ASR system by multilingual training, transfer learning, text-to-text mapping and synthetic audio
Interspeech (Interspeech), 2020
Manuel Giollo
Deniz Gunceler
Yulan Liu
D. Willett
160
11
0
25 Nov 2020
Autosegmental Neural Nets: Should Phones and Tones be Synchronous or Asynchronous?
Interspeech (Interspeech), 2020
Jialu Li
M. Hasegawa-Johnson
173
5
0
28 Jul 2020
Massively Multilingual ASR: 50 Languages, 1 Model, 1 Billion Parameters
Vineel Pratap
Anuroop Sriram
Paden Tomasello
Awni Y. Hannun
Vitaliy Liptchinsky
Gabriel Synnaeve
R. Collobert
254
153
0
06 Jul 2020
Unsupervised Cross-lingual Representation Learning for Speech Recognition
Interspeech (Interspeech), 2020
Alexis Conneau
Alexei Baevski
R. Collobert
Abdel-rahman Mohamed
Michael Auli
SSL
362
919
0
24 Jun 2020
Improving Cross-Lingual Transfer Learning for End-to-End Speech Recognition with Speech Translation
Changhan Wang
J. Pino
Jiatao Gu
148
33
0
09 Jun 2020
Fusion Recurrent Neural Network
Yiwen Sun
Yulu Wang
Kun Fu
Zheng Wang
Changshui Zhang
Jieping Ye
92
1
0
07 Jun 2020
Improved acoustic word embeddings for zero-resource languages using multilingual transfer
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2020
Herman Kamper
Yevgen Matusevych
Sharon Goldwater
242
21
0
02 Jun 2020
An End-to-End Mispronunciation Detection System for L2 English Speech Leveraging Novel Anti-Phone Modeling
Interspeech (Interspeech), 2020
Bi-Cheng Yan
Meng-Che Wu
Hsiao-Tsung Hung
Berlin Chen
138
48
0
25 May 2020
DARTS-ASR: Differentiable Architecture Search for Multilingual Speech Recognition and Adaptation
Yi-Chen Chen
Jui-Yang Hsu
Cheng-Kuang Lee
Hung-yi Lee
197
33
0
13 May 2020
Speech Corpus of Ainu Folklore and End-to-end Speech Recognition for Ainu Language
International Conference on Language Resources and Evaluation (LREC), 2020
Kohei Matsuura
Sei Ueno
Masato Mimura
S. Sakai
Tatsuya Kawahara
CVBM
175
15
0
16 Feb 2020
Multilingual acoustic word embedding models for processing zero-resource languages
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
Herman Kamper
Yevgen Matusevych
Sharon Goldwater
272
25
0
06 Feb 2020
Meta Learning for End-to-End Low-Resource Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019
Jui-Yang Hsu
Yuan-Jui Chen
Hung-yi Lee
116
114
0
26 Oct 2019
Analyzing ASR pretraining for low-resource speech-to-text translation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019
Mihaela C. Stoian
Sameer Bansal
Sharon Goldwater
227
71
0
23 Oct 2019
A practical two-stage training strategy for multi-stream end-to-end speech recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019
Ruizhi Li
Gregory Sell
Xiaofei Wang
Shinji Watanabe
H. Hermansky
99
7
0
23 Oct 2019