2102.01757
Cited By
The Multilingual TEDx Corpus for Speech Recognition and Translation
Interspeech (Interspeech), 2021
2 February 2021
Elizabeth Salesky
Sanjeev Khudanpur
Jacob Bremerman
R. Cattoni
Matteo Negri
Marco Turchi
Douglas W. Oard
Matt Post
Papers citing "The Multilingual TEDx Corpus for Speech Recognition and Translation"
50 / 76 papers shown
Fast Neural Tangent Kernel Alignment, Norm and Effective Rank via Trace Estimation
James Hazelden
64
0
0
13 Nov 2025
Whisper-UT: A Unified Translation Framework for Speech and Text
Cihan Xiao
Matthew Wiesner
Debashish Chakraborty
Reno Kriz
Keith Cunningham
Kenton W. Murray
Kevin Duh
Luis Tavarez-Arce
Paul McNamee
Sanjeev Khudanpur
80
0
0
19 Sep 2025
SENSE models: an open source solution for multilingual and multimodal semantic-based tasks
Salima Mdhaffar
Haroun Elleuch
Chaimae Chellaf
H. Nguyen
Yannick Esteve
VLM
112
0
0
15 Sep 2025
NTU Speechlab LLM-Based Multilingual ASR System for Interspeech MLC-SLM Challenge 2025
Yizhou Peng
Bin Wang
Yi-Wen Chao
Ziyang Ma
Haoyang Zhang
Hexin Liu
Xie Chen
Eng Siong Chng
ELM
205
1
0
16 Jun 2025
Voice Conversion Improves Cross-Domain Robustness for Spoken Arabic Dialect Identification
Badr M. Abdullah
Matthew Baas
Bernd Möbius
Dietrich Klakow
106
1
0
30 May 2025
MAVFlow: Preserving Paralinguistic Elements with Conditional Flow Matching for Zero-Shot AV2AV Multilingual Translation
Sungwoo Cho
J. Choi
Sungnyun Kim
Se-Young Yun
289
0
0
14 Mar 2025
Connecting Voices: LoReSpeech as a Low-Resource Speech Parallel Corpus
Samy Ouzerrout
AuLLM
156
0
0
25 Feb 2025
mWhisper-Flamingo for Multilingual Audio-Visual Noise-Robust Speech Recognition
IEEE Signal Processing Letters (IEEE SPL), 2025
Andrew Rouditchenko
Saurabhchand Bhati
Samuel Thomas
Hilde Kuehne
Rogerio Feris
483
1
0
03 Feb 2025
Towards Building Large Scale Datasets and State-of-the-Art Automatic Speech Translation Systems for 14 Indian Languages
Sparsh Jain
Ashwin Sankar
Devilal Choudhary
Dhairya Suman
Nikhil Narasimhan
Mohammed Safi Ur Rahman Khan
Anoop Kunchukuttan
Mitesh M. Khapra
Raj Dabre
415
3
0
07 Nov 2024
MOSEL: 950,000 Hours of Speech Data for Open-Source Speech Foundation Model Training on EU Languages
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Marco Gaido
Sara Papi
L. Bentivogli
Alessio Brutti
Mauro Cettolo
R. Gretter
M. Matassoni
Mohamed Nabih
Matteo Negri
190
12
0
01 Oct 2024
A Large Dataset of Spontaneous Speech with the Accent Spoken in São Paulo for Automatic Speech Recognition Evaluation
Brazilian Conference on Intelligent Systems (BRACIS), 2024
Rodrigo Lima
S. Leal
Arnaldo Candido Junior
S. Aluísio
129
2
0
10 Sep 2024
Speech-MASSIVE: A Multilingual Speech Dataset for SLU and Beyond
Interspeech (Interspeech), 2024
Beomseok Lee
Ioan Calapodescu
Marco Gaido
Matteo Negri
Laurent Besacier
AuLLM
220
16
0
07 Aug 2024
Tailored Design of Audio-Visual Speech Recognition Models using Branchformers
David Gimeno-Gómez
Carlos David Martínez Hinarejos
337
5
0
09 Jul 2024
Towards Robust Speech Representation Learning for Thousands of Languages
William Chen
Wangyou Zhang
Yifan Peng
Xinjian Li
Jinchuan Tian
Jiatong Shi
Xuankai Chang
Soumi Maiti
Karen Livescu
Shinji Watanabe
ELM
291
42
0
30 Jun 2024
MSR-86K: An Evolving, Multilingual Corpus with 86,300 Hours of Transcribed Audio for Speech Recognition Research
Song Li
Yongbin You
Xuezhi Wang
Zhengkun Tian
Ke Ding
Guanglu Wan
179
10
0
26 Jun 2024
FFSTC: Fongbe to French Speech Translation Corpus
International Conference on Language Resources and Evaluation (LREC), 2024
D. F. Kponou
F. Laleye
E. C. Ezin
164
2
0
08 Mar 2024
Where Visual Speech Meets Language: VSP-LLM Framework for Efficient and Context-Aware Visual Speech Processing
Jeong Hun Yeo
Seunghee Han
Minsu Kim
Y. Ro
260
31
0
23 Feb 2024
AnnoTheia: A Semi-Automatic Annotation Toolkit for Audio-Visual Speech Technologies
José-M. Acosta-Triana
David Gimeno-Gómez
Carlos David Martínez Hinarejos
VLM
VGen
265
2
0
20 Feb 2024
Speech Translation with Speech Foundation Models and Large Language Models: What is There and What is Missing?
Marco Gaido
Sara Papi
Matteo Negri
L. Bentivogli
404
26
0
19 Feb 2024
A Case Study on Filtering for End-to-End Speech Translation
Md Mahfuz Ibn Alam
Antonios Anastasopoulos
156
1
0
02 Feb 2024
Efficient Training for Multilingual Visual Speech Recognition: Pre-training with Discretized Visual Speech Representation
Minsu Kim
Jeong Hun Yeo
Se Jin Park
J. Choi
Y. Ro
225
7
0
18 Jan 2024
AV2AV: Direct Audio-Visual Speech to Audio-Visual Speech Translation with Unified Audio-Visual Speech Representation
Computer Vision and Pattern Recognition (CVPR), 2023
J. Choi
Se Jin Park
Minsu Kim
Y. Ro
323
16
0
05 Dec 2023
End-to-End Speech-to-Text Translation: A Survey
Nivedita Sethiya
Chandresh Kumar Maurya
456
13
0
02 Dec 2023
Speaker-Adapted End-to-End Visual Speech Recognition for Continuous Spanish
IberSPEECH Conference (IberSPEECH), 2022
David Gimeno-Gómez
Carlos David Martínez Hinarejos
131
0
0
21 Nov 2023
On-the-Fly Fusion of Large Language Models and Machine Translation
Hieu T. Hoang
Huda Khayrallah
Marcin Junczys-Dowmunt
234
4
0
14 Nov 2023
Automatic Disfluency Detection from Untranscribed Speech
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Amrit Romana
K. Koishida
E. Provost
205
16
0
01 Nov 2023
How To Build Competitive Multi-gender Speech Translation Models For Controlling Speaker Gender Translation
Marco Gaido
Dennis Fucci
Matteo Negri
L. Bentivogli
236
2
0
23 Oct 2023
Long-form Simultaneous Speech Translation: Thesis Proposal
International Joint Conference on Natural Language Processing (IJCNLP), 2023
Peter Polák
3DV
184
3
0
17 Oct 2023
Visual Speech Recognition for Languages with Limited Labeled Data using Automatic Labels from Whisper
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Jeong Hun Yeo
Minsu Kim
Shinji Watanabe
Y. Ro
VLM
197
16
0
15 Sep 2023
LeBenchmark 2.0: a Standardized, Replicable and Enhanced Framework for Self-supervised Representations of French Speech
Computer Speech and Language (CSL), 2023
Titouan Parcollet
H. Nguyen
Solène Evain
Marcely Zanon Boito
Adrien Pupier
...
François Portet
Solange Rossato
Fabien Ringeval
D. Schwab
Laurent Besacier
240
25
0
11 Sep 2023
Lip Reading for Low-resource Languages by Learning and Combining General Speech Knowledge and Language-specific Knowledge
IEEE International Conference on Computer Vision (ICCV), 2023
Minsu Kim
Jeong Hun Yeo
J. Choi
Y. Ro
164
27
0
18 Aug 2023
End-to-End Evaluation for Low-Latency Simultaneous Speech Translation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Christian Huber
Tu Anh Dinh
Carlos Mullov
Ngoc-Quan Pham
Thai-Binh Nguyen
...
Danni Liu
Zhaolin Li
Sai Koneru
Jan Niehues
A. Waibel
211
10
0
07 Aug 2023
Many-to-Many Spoken Language Translation via Unified Speech and Text Representation Learning with Unit-to-Unit Translation
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Minsu Kim
J. Choi
Dahun Kim
Y. Ro
174
10
0
03 Aug 2023
Towards cross-language prosody transfer for dialog
Interspeech (Interspeech), 2023
Jonathan Avila
Nigel G. Ward
202
7
0
09 Jul 2023
HK-LegiCoST: Leveraging Non-Verbatim Transcripts for Speech Translation
Interspeech (Interspeech), 2023
Cihan Xiao
Lin Zhang
Jinyi Yang
Dongji Gao
Sanjeev Khudanpur
Kevin Duh
208
2
0
20 Jun 2023
NAVER LABS Europe's Multilingual Speech Translation Systems for the IWSLT 2023 Low-Resource Track
International Workshop on Spoken Language Translation (IWSLT), 2023
Edward Gow-Smith
Alexandre Berard
Marcely Zanon Boito
Ioan Calapodescu
231
14
0
13 Jun 2023
The Interpreter Understands Your Meaning: End-to-end Spoken Language Understanding Aided by Speech Translation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Mutian He
Philip N. Garner
319
5
0
16 May 2023
Learning Cross-lingual Visual Speech Representations
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Andreas Zinonos
A. Haliassos
Pingchuan Ma
Stavros Petridis
Maja Pantic
SSL
126
10
0
14 Mar 2023
MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation
Interspeech (Interspeech), 2023
Mohamed Anwar
Bowen Shi
Vedanuj Goswami
Wei-Ning Hsu
J. Pino
Changhan Wang
194
44
0
01 Mar 2023
Efficient CTC Regularization via Coarse Labels for End-to-End Speech Translation
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023
Biao Zhang
Barry Haddow
Rico Sennrich
233
3
0
21 Feb 2023
SegAugment: Maximizing the Utility of Speech Translation Data with Segmentation-based Augmentations
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Ioannis Tsiamas
José A. R. Fonollosa
Marta R. Costa-jussá
238
6
0
19 Dec 2022
BLASER: A Text-Free Speech-to-Speech Translation Evaluation Metric
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Mingda Chen
Paul-Ambroise Duquenne
Pierre Yves Andrews
Justine T. Kao
Alexandre Mourachko
Holger Schwenk
Marta R. Costa-jussá
230
23
0
16 Dec 2022
UnitY: Two-pass Direct Speech-to-speech Translation with Discrete Units
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Hirofumi Inaguma
Sravya Popuri
Ilia Kulikov
Peng-Jen Chen
Changhan Wang
Yu-An Chung
Yun Tang
Ann Lee
Shinji Watanabe
J. Pino
280
75
0
15 Dec 2022
Dialogs Re-enacted Across Languages
Nigel G. Ward
Jonathan Avila
Emilia Rivas
Divette Marco
182
2
0
18 Nov 2022
SpeechMatrix: A Large-Scale Mined Corpus of Multilingual Speech-to-Speech Translations
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Paul-Ambroise Duquenne
Hongyu Gong
Ning Dong
Jingfei Du
Ann Lee
Vedanuj Goswami
Changhan Wang
J. Pino
Benoît Sagot
Holger Schwenk
240
44
0
08 Nov 2022
LAMASSU: Streaming Language-Agnostic Multilingual Speech Recognition and Translation Using Neural Transducers
Interspeech (Interspeech), 2022
Peidong Wang
Eric Sun
Jian Xue
Yu-Huan Wu
Long Zhou
Yashesh Gaur
Shujie Liu
Jinyu Li
328
10
0
05 Nov 2022
Improving Speech-to-Speech Translation Through Unlabeled Text
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Xuan-Phi Nguyen
Sravya Popuri
Changhan Wang
Yun Tang
Ilia Kulikov
Hongyu Gong
175
9
0
26 Oct 2022
Don't Discard Fixed-Window Audio Segmentation in Speech-to-Text Translation
Conference on Machine Translation (WMT), 2022
Chantal Amrhein
Barry Haddow
158
10
0
24 Oct 2022
Bringing NURC/SP to Digital Life: the Role of Open-source Automatic Speech Recognition Models
L. Gris
Arnaldo Cândido Júnior
V. G. Santos
B. Dias
Marli Quadros Leite
F. Svartman
S. Aluísio
124
3
0
14 Oct 2022
CTC Alignments Improve Autoregressive Translation
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2022
Brian Yan
Siddharth Dalmia
Yosuke Higuchi
Graham Neubig
Florian Metze
A. Black
Shinji Watanabe
173
36
0
11 Oct 2022