
The Multilingual TEDx Corpus for Speech Recognition and Translation

Interspeech (Interspeech), 2021

2 February 2021

Papers citing "The Multilingual TEDx Corpus for Speech Recognition and Translation"

50 / 76 papers shown

Fast Neural Tangent Kernel Alignment, Norm and Effective Rank via Trace Estimation

James Hazelden

131

13 Nov 2025

Whisper-UT: A Unified Translation Framework for Speech and Text

Cihan Xiao

Matthew Wiesner

Debashish Chakraborty

130

19 Sep 2025

SENSE models: an open source solution for multilingual and multimodal semantic-based tasks

200

15 Sep 2025

NTU Speechlab LLM-Based Multilingual ASR System for Interspeech MLC-SLM Challenge 2025

330

16 Jun 2025

Voice Conversion Improves Cross-Domain Robustness for Spoken Arabic Dialect Identification

178

30 May 2025

MAVFlow: Preserving Paralinguistic Elements with Conditional Flow Matching for Zero-Shot AV2AV Multilingual Translation

411

14 Mar 2025

Connecting Voices: LoReSpeech as a Low-Resource Speech Parallel Corpus

Samy Ouzerrout

AuLLM

228

25 Feb 2025

mWhisper-Flamingo for Multilingual Audio-Visual Noise-Robust Speech Recognition

IEEE Signal Processing Letters (IEEE SPL), 2025

615

03 Feb 2025

Towards Building Large Scale Datasets and State-of-the-Art Automatic Speech Translation Systems for 14 Indian Languages

Mohammed Safi Ur Rahman Khan

Anoop Kunchukuttan

Mitesh M. Khapra

Mary Dabre

578

07 Nov 2024

MOSEL: 950,000 Hours of Speech Data for Open-Source Speech Foundation Model Training on EU Languages

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

295

01 Oct 2024

A Large Dataset of Spontaneous Speech with the Accent Spoken in São Paulo for Automatic Speech Recognition Evaluation

Brazilian Conference on Intelligent Systems (BRACIS), 2024

Rodrigo Lima

S. Leal

Arnaldo Candido Junior

S. Aluísio

227

10 Sep 2024

Speech-MASSIVE: A Multilingual Speech Dataset for SLU and Beyond

Interspeech (Interspeech), 2024

351

07 Aug 2024

Tailored Design of Audio-Visual Speech Recognition Models using Branchformers

David Gimeno-Gómez

Carlos David Martínez Hinarejos

520

09 Jul 2024

Towards Robust Speech Representation Learning for Thousands of Languages

William Chen

Wangyou Zhang

Yifan Peng

Xinjian Li

Jinchuan Tian

Jiatong Shi

Xuankai Chang

Soumi Maiti

Karen Livescu

Shinji Watanabe

ELM

437

30 Jun 2024

MSR-86K: An Evolving, Multilingual Corpus with 86,300 Hours of Transcribed Audio for Speech Recognition Research

Ke Ding

Guanglu Wan

252

26 Jun 2024

FFSTC: Fongbe to French Speech Translation Corpus

International Conference on Language Resources and Evaluation (LREC), 2024

D. F. Kponou

F. Laleye

E. C. Ezin

251

08 Mar 2024

Where Visual Speech Meets Language: VSP-LLM Framework for Efficient and Context-Aware Visual Speech Processing

Jeong Hun Yeo

Seunghee Han

Minsu Kim

Y. Ro

379

23 Feb 2024

AnnoTheia: A Semi-Automatic Annotation Toolkit for Audio-Visual Speech Technologies

José-M. Acosta-Triana

David Gimeno-Gómez

Carlos David Martínez Hinarejos

VLM VGen

352

20 Feb 2024

Speech Translation with Speech Foundation Models and Large Language Models: What is There and What is Missing?

524

19 Feb 2024

A Case Study on Filtering for End-to-End Speech Translation

Md Mahfuz Ibn Alam

Antonios Anastasopoulos

237

02 Feb 2024

Efficient Training for Multilingual Visual Speech Recognition: Pre-training with Discretized Visual Speech Representation

Jeong Hun Yeo

340

18 Jan 2024

AV2AV: Direct Audio-Visual Speech to Audio-Visual Speech Translation with Unified Audio-Visual Speech Representation

Computer Vision and Pattern Recognition (CVPR), 2023

469

05 Dec 2023

End-to-End Speech-to-Text Translation: A Survey

Nivedita Sethiya

Chandresh Kumar Maurya

585

02 Dec 2023

Speaker-Adapted End-to-End Visual Speech Recognition for Continuous Spanish

IberSPEECH Conference (IberSPEECH), 2022

David Gimeno-Gómez

Carlos David Martínez Hinarejos

276

21 Nov 2023

On-the-Fly Fusion of Large Language Models and Machine Translation

Hieu T. Hoang

Huda Khayrallah

Marcin Junczys-Dowmunt

329

14 Nov 2023

Automatic Disfluency Detection from Untranscribed Speech

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023

Amrit Romana

K. Koishida

E. Provost

334

01 Nov 2023

How To Build Competitive Multi-gender Speech Translation Models For Controlling Speaker Gender Translation

357

23 Oct 2023

Long-form Simultaneous Speech Translation: Thesis Proposal

International Joint Conference on Natural Language Processing (IJCNLP), 2023

Peter Polák

3DV

287

17 Oct 2023

Visual Speech Recognition for Languages with Limited Labeled Data using Automatic Labels from Whisper

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Jeong Hun Yeo

328

15 Sep 2023

LeBenchmark 2.0: a Standardized, Replicable and Enhanced Framework for Self-supervised Representations of French Speech

Computer Speech and Language (CSL), 2023

...

307

11 Sep 2023

Lip Reading for Low-resource Languages by Learning and Combining General Speech Knowledge and Language-specific Knowledge

IEEE International Conference on Computer Vision (ICCV), 2023

Minsu Kim

Jeong Hun Yeo

J. Choi

Y. Ro

288

18 Aug 2023

End-to-End Evaluation for Low-Latency Simultaneous Speech Translation

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

...

327

07 Aug 2023

Many-to-Many Spoken Language Translation via Unified Speech and Text Representation Learning with Unit-to-Unit Translation

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023

277

03 Aug 2023

Towards cross-language prosody transfer for dialog

Interspeech (Interspeech), 2023

Jonathan Avila

Nigel G. Ward

333

09 Jul 2023

HK-LegiCoST: Leveraging Non-Verbatim Transcripts for Speech Translation

Interspeech (Interspeech), 2023

Sanjeev Khudanpur

269

20 Jun 2023

NAVER LABS Europe's Multilingual Speech Translation Systems for the IWSLT 2023 Low-Resource Track

International Workshop on Spoken Language Translation (IWSLT), 2023

307

13 Jun 2023

The Interpreter Understands Your Meaning: End-to-end Spoken Language Understanding Aided by Speech Translation

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Mutian He

Philip N. Garner

416

16 May 2023

Learning Cross-lingual Visual Speech Representations

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

198

14 Mar 2023

MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation

Interspeech (Interspeech), 2023

274

01 Mar 2023

Efficient CTC Regularization via Coarse Labels for End-to-End Speech Translation

Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023

Biao Zhang

Barry Haddow

Rico Sennrich

385

21 Feb 2023

SegAugment: Maximizing the Utility of Speech Translation Data with Segmentation-based Augmentations

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

Ioannis Tsiamas

José A. R. Fonollosa

Marta R. Costa-jussá

351

19 Dec 2022

BLASER: A Text-Free Speech-to-Speech Translation Evaluation Metric

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

Mingda Chen

Paul-Ambroise Duquenne

282

16 Dec 2022

UnitY: Two-pass Direct Speech-to-speech Translation with Discrete Units

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

403

15 Dec 2022

Dialogs Re-enacted Across Languages

246

18 Nov 2022

SpeechMatrix: A Large-Scale Mined Corpus of Multilingual Speech-to-Speech Translations

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

Paul-Ambroise Duquenne

306

08 Nov 2022

LAMASSU: Streaming Language-Agnostic Multilingual Speech Recognition and Translation Using Neural Transducers

Interspeech (Interspeech), 2022

446

05 Nov 2022

Improving Speech-to-Speech Translation Through Unlabeled Text

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

226

26 Oct 2022

Don't Discard Fixed-Window Audio Segmentation in Speech-to-Text Translation

Conference on Machine Translation (WMT), 2022

Chantal Amrhein

Barry Haddow

327

24 Oct 2022

Bringing NURC/SP to Digital Life: the Role of Open-source Automatic Speech Recognition Models

L. Gris

Arnaldo Cândido Júnior

205

14 Oct 2022

CTC Alignments Improve Autoregressive Translation

Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2022

Graham Neubig

217

11 Oct 2022