CoVoST: A Diverse Multilingual Speech-To-Text Translation Corpus

International Conference on Language Resources and Evaluation (LREC), 2020

4 February 2020

Papers citing "CoVoST: A Diverse Multilingual Speech-To-Text Translation Corpus"

50 / 63 papers shown

Ming-UniAudio: Speech LLM for Joint Understanding, Generation and Editing with Unified Representation

...

446

26 Oct 2025

InteractiveOmni: A Unified Omni-modal Model for Audio-Visual Multi-turn Dialogue

...

474

15 Oct 2025

CS3-Bench: Evaluating and Enhancing Speech-to-Speech LLMs for Mandarin-English Code-Switching

127

09 Oct 2025

SimulSense: Sense-Driven Interpreting for Efficient Simultaneous Speech Translation

Haotian Tan

Hiroki Ouchi

S. Sakti

161

26 Sep 2025

PART: Progressive Alignment Representation Training for Multilingual Speech-To-Text with LLMs

136

24 Sep 2025

Whisper-UT: A Unified Translation Framework for Speech and Text

Cihan Xiao

Matthew Wiesner

Debashish Chakraborty

117

19 Sep 2025

Self-Improvement for Audio Large Language Model using Unlabeled Speech

211

27 Jul 2025

SeqPO-SiMT: Sequential Policy Optimization for Simultaneous Machine Translation

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

408

27 May 2025

Reshaping Representation Space to Balance the Safety and Over-rejection in Large Audio Language Models

234

26 May 2025

Miipher-2: A Universal Speech Restoration Model for Million-Hour Scale Data Restoration

1.1K

07 May 2025

Mind the Gap! Static and Interactive Evaluations of Large Audio Models

245

21 Feb 2025

Cross-lingual Embedding Clustering for Hierarchical Softmax in Low-Resource Multilingual Speech Recognition

IEEE Transactions on Audio, Speech, and Language Processing (TASLP), 2025

188

29 Jan 2025

SPES: Spectrogram Perturbation for Explainable Speech-to-Text Generation

631

03 Nov 2024

Towards Achieving Human Parity on End-to-end Simultaneous Speech Translation via LLM Agent

366

31 Jul 2024

NAIST Simultaneous Speech Translation System for IWSLT 2024

...

355

30 Jun 2024

LeaPformer: Enabling Linear Transformers for Autoregressive and Simultaneous Tasks via Learned Proportions

International Conference on Machine Learning (ICML), 2024

317

18 May 2024

FFSTC: Fongbe to French Speech Translation Corpus

International Conference on Language Resources and Evaluation (LREC), 2024

D. F. Kponou

F. Laleye

E. C. Ezin

230

08 Mar 2024

AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension

Jin Xu

Yunfei Chu

...

Chang Zhou

Jingren Zhou

LM&MA AuLLM ALM

271

196

12 Feb 2024

An Experimental Study: Assessing the Combined Framework of WavLM and BEST-RQ for Text-to-Speech Synthesis

Via Nielson

Steven Hillis

118

08 Dec 2023

End-to-End Speech-to-Text Translation: A Survey

Nivedita Sethiya

Chandresh Kumar Maurya

557

02 Dec 2023

End-to-End Single-Channel Speaker-Turn Aware Conversational Speech Translation

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

267

01 Nov 2023

Sparks of Large Audio Models: A Survey and Outlook

...

Björn W. Schuller

807

24 Aug 2023

Recent Advances in Direct Speech-to-text Translation

International Joint Conference on Artificial Intelligence (IJCAI), 2023

Jingbo Zhu

372

20 Jun 2023

Tagged End-to-End Simultaneous Speech Translation Training using Simultaneous Interpretation Data

International Workshop on Spoken Language Translation (IWSLT), 2023

217

14 Jun 2023

Improved Cross-Lingual Transfer Learning For Automatic Speech Translation

402

01 Jun 2023

BIG-C: a Multimodal Multi-Purpose Dataset for Bemba

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Claytone Sikasote

Eunice Mukonde

Md Mahfuz Ibn Alam

Antonios Anastasopoulos

229

26 May 2023

Inter-connection: Effective Connection between Pre-trained Encoder and Decoder for Speech Translation

Interspeech (Interspeech), 2023

Yuta Nishikawa

Satoshi Nakamura

166

26 May 2023

MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation

Interspeech (Interspeech), 2023

256

01 Mar 2023

Jointly Optimizing Translations and Speech Timing to Improve Isochrony in Automatic Dubbing

Alexandra Chronopoulou

194

25 Feb 2023

Pre-training for Speech Translation: CTC Meets Optimal Transport

International Conference on Machine Learning (ICML), 2023

416

27 Jan 2023

SegAugment: Maximizing the Utility of Speech Translation Data with Segmentation-based Augmentations

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

Ioannis Tsiamas

José A. R. Fonollosa

Marta R. Costa-jussá

340

19 Dec 2022

End-to-End Speech Translation of Arabic to English Broadcast News

Workshop on Arabic Natural Language Processing (WANLP), 2022

Fethi Bougares

Salim Jouili

140

11 Dec 2022

SpeechMatrix: A Large-Scale Mined Corpus of Multilingual Speech-to-Speech Translations

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

Paul-Ambroise Duquenne

296

08 Nov 2022

Make More of Your Data: Minimal Effort Data Augmentation for Automatic Speech Recognition and Translation

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

339

27 Oct 2022

Discrete Cross-Modal Alignment Enables Zero-Shot Speech Translation

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

285

18 Oct 2022

Bringing NURC/SP to Digital Life: the Role of Open-source Automatic Speech Recognition Models

L. Gris

Arnaldo Cândido Júnior

191

14 Oct 2022

A High-Quality and Large-Scale Dataset for English-Vietnamese Speech Translation

Interspeech (Interspeech), 2022

188

08 Aug 2022

T-Modules: Translation Modules for Zero-Shot Cross-Modal Machine Translation

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

Paul-Ambroise Duquenne

Hongyu Gong

Benoît Sagot

Holger Schwenk

258

24 May 2022

Non-Parametric Domain Adaptation for End-to-End Speech Translation

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

559

23 May 2022

SAMU-XLSR: Semantically-Aligned Multimodal Utterance-level Cross-Lingual Speech Representation

IEEE Journal on Selected Topics in Signal Processing (IEEE JSTSP), 2022

Sameer Khurana

Antoine Laurent

James R. Glass

207

17 May 2022

Efficient yet Competitive Speech Translation: FBK@IWSLT2022

International Workshop on Spoken Language Translation (IWSLT), 2022

192

05 May 2022

End-to-End Speech Translation for Code Switched Speech

Findings (Findings), 2022

302

11 Apr 2022

Hierarchical Softmax for End-to-End Low-resource Multilingual Speech Recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Qianying Liu

Zhuo Gong

Zhengdong Yang

Yuhang Yang

Sheng Li

...

Sadao Kurohashi

227

08 Apr 2022

An Analysis of Semantically-Aligned Speech-Text Embeddings

Spoken Language Technology Workshop (SLT), 2022

M. Huzaifah

Ivan Kukanov

254

04 Apr 2022

Prabhupadavani: A Code-mixed Speech Translation Data for 25 Languages

190

27 Jan 2022

CVSS Corpus and Massively Multilingual Speech-to-Speech Translation

International Conference on Language Resources and Evaluation (LREC), 2022

Ye Jia

Michelle Tadmor Ramanovich

Quan Wang

Heiga Zen

SLR

349

11 Jan 2022

CORAA: a large corpus of spontaneous and prepared speech manually validated for speech recognition in Brazilian Portuguese

Arnaldo Cândido Júnior

...

Daniel Peixoto Pinto da Silva

Fernando Gorgulho Fayet

B. Carlotto

L. Gris

S. Aluísio

246

14 Oct 2021

Is "moby dick" a Whale or a Bird? Named Entities and Terminology in Speech Translation

114

15 Sep 2021

The HW-TSC's Offline Speech Translation Systems for IWSLT 2021 Evaluation

Jiaxin Guo

...

Xingshan Zeng

116

09 Aug 2021

FST: the FAIR Speech Translation System for the IWSLT21 Multilingual Shared Task

International Workshop on Spoken Language Translation (IWSLT), 2021

Xian Li

238

14 Jul 2021