Speech Translation and the End-to-End Promise: Taking Stock of Where We Are

Annual Meeting of the Association for Computational Linguistics (ACL), 2020

14 April 2020

Matthias Sperber

Matthias Paulik

Papers citing "Speech Translation and the End-to-End Promise: Taking Stock of Where We Are"

MCAT: Scaling Many-to-Many Speech-to-Text Translation with MLLMs to 70 Languages

171

01 Dec 2025

V-SAT: Video Subtitle Annotation Tool

114

28 Oct 2025

Listening or Reading? Evaluating Speech Awareness in Chain-of-Thought Speech-to-Text Translation

Cristina España-Bonet

137

03 Oct 2025

Vision-Grounded Machine Interpreting: Improving the Translation Process through Visual Cues

Claudio Fantinuoli

202

28 Sep 2025

Toward Machine Interpreting: Lessons from Human Interpreting Studies

193

11 Aug 2025

PHRASED: Phrase Dictionary Biasing for Speech Translation

Aswin Shanmugam Subramanian

Jinyu Li

229

10 Jun 2025

Speech-to-Speech Translation Pipelines for Conversations in Low-Resource Languages

228

02 Jun 2025

Different Speech Translation Models Encode and Translate Speaker Gender Differently

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

317

02 Jun 2025

Spatial Speech Translation: Translating Across Space With Binaural Hearables

International Conference on Human Factors in Computing Systems (CHI), 2025

241

25 Apr 2025

DoCIA: An Online Document-Level Context Incorporation Agent for Speech Translation

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

...

323

07 Apr 2025

Joint Training And Decoding for Multilingual End-to-End Simultaneous Speech Translation

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

351

14 Mar 2025

Speech Translation Refinement using Large Language Models

1.0K

28 Jan 2025

Prepending or Cross-Attention for Speech-to-Text? An Empirical Comparison

North American Chapter of the Association for Computational Linguistics (NAACL), 2025

487

04 Jan 2025

Towards Building Large Scale Datasets and State-of-the-Art Automatic Speech Translation Systems for 14 Indian Languages

Mohammed Safi Ur Rahman Khan

Anoop Kunchukuttan

Mitesh M. Khapra

Raj Dabre

571

07 Nov 2024

Speech is More Than Words: Do Speech-to-Text Translation Systems Leverage Prosody?

Conference on Machine Translation (WMT), 2024

195

31 Oct 2024

CTC-GMM: CTC guided modality matching for fast and accurate streaming speech translation

Spoken Language Technology Workshop (SLT), 2024

Rui Zhao

Jinyu Li

Ruchao Fan

Matt Post

211

07 Oct 2024

Optimizing Rare Word Accuracy in Direct Speech Translation with a Retrieval-and-Demonstration Approach

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Siqi Li

Danni Liu

Jan Niehues

309

13 Sep 2024

Lightweight Audio Segmentation for Long-form Speech Translation

Interspeech (Interspeech), 2024

221

15 Jun 2024

Soft Language Identification for Language-Agnostic Many-to-One End-to-End Speech Translation

Peidong Wang

Jian Xue

Jinyu Li

Junkun Chen

Aswin Shanmugam Subramanian

265

12 Jun 2024

TransVIP: Speech to Speech Translation System with Voice and Isochrony Preservation

...

385

28 May 2024

SBAAM! Eliminating Transcript Dependency in Automatic Subtitling

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

262

17 May 2024

Speech Translation with Speech Foundation Models and Large Language Models: What is There and What is Missing?

518

19 Feb 2024

Pushing the Limits of Zero-shot End-to-End Speech Translation

362

16 Feb 2024

A Case Study on Filtering for End-to-End Speech Translation

Md Mahfuz Ibn Alam

Antonios Anastasopoulos

230

02 Feb 2024

Prosody in Cascade and Direct Speech-to-Text Translation: a case study on Korean Wh-Phrases

196

01 Feb 2024

Towards a Deep Understanding of Multilingual End-to-End Speech Translation

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Yikun Lei

238

31 Oct 2023

Long-form Simultaneous Speech Translation: Thesis Proposal

International Joint Conference on Natural Language Processing (IJCNLP), 2023

Peter Polák

282

17 Oct 2023

Long-Form End-to-End Speech Translation via Latent Alignment Segmentation

Spoken Language Technology Workshop (SLT), 2023

Peter Polák

Ondrej Bojar

296

20 Sep 2023

DiariST: Streaming Speech Translation with Speaker Diarization

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

292

14 Sep 2023

On decoder-only architecture for speech-to-text and large language model integration

Automatic Speech Recognition & Understanding (ASRU), 2023

...

649

204

08 Jul 2023

Recent Advances in Direct Speech-to-text Translation

International Joint Conference on Artificial Intelligence (IJCAI), 2023

Jingbo Zhu

377

20 Jun 2023

Speech Translation with Foundation Models and Optimal Transport: UPC at IWSLT23

International Workshop on Spoken Language Translation (IWSLT), 2023

269

02 Jun 2023

Robustness of Multi-Source MT to Transcription Errors

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

261

26 May 2023

ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation

Neural Information Processing Systems (NeurIPS), 2023

338

24 May 2023

Understanding and Bridging the Modality Gap for Speech Translation

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Qingkai Fang

Yang Feng

361

15 May 2023

Selective Data Augmentation for Robust Speech Translation

R. Acharya

Ashish Panda

Sunil Kumar Kopparapu

148

22 Mar 2023

SegAugment: Maximizing the Utility of Speech Translation Data with Segmentation-based Augmentations

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

Ioannis Tsiamas

José A. R. Fonollosa

Marta R. Costa-jussá

351

19 Dec 2022

A Weakly-Supervised Streaming Multilingual Speech Model with Truly Zero-Shot Capability

Automatic Speech Recognition & Understanding (ASRU), 2022

271

04 Nov 2022

Efficient Speech Translation with Dynamic Latent Perceivers

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

273

28 Oct 2022

Does Joint Training Really Help Cascaded Speech Translation?

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

331

24 Oct 2022

Towards Relation Extraction From Speech

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

251

17 Oct 2022

Generating Synthetic Speech from SpokenVocab for Speech Translation

Findings (Findings), 2022

Jinming Zhao

Gholamreza Haffari

Ehsan Shareghi

224

15 Oct 2022

Direct Speech Translation for Automatic Subtitling

Transactions of the Association for Computational Linguistics (TACL), 2022

276

27 Sep 2022

A Comprehensive Survey of Natural Language Generation Advances from the Perspective of Digital Deception

263

11 Aug 2022

A High-Quality and Large-Scale Dataset for English-Vietnamese Speech Translation

Interspeech (Interspeech), 2022

203

08 Aug 2022

M-Adapter: Modality Adaptation for End-to-End Speech-to-Text Translation

Interspeech (Interspeech), 2022

Jinming Zhao

Haomiao Yang

Ehsan Shareghi

Gholamreza Haffari

263

03 Jul 2022

Multiformer: A Head-Configurable Transformer-Based Model for Direct Speech Translation

North American Chapter of the Association for Computational Linguistics (NAACL), 2022

219

14 May 2022

Joint Generation of Captions and Subtitles with Dual Decoding

International Workshop on Spoken Language Translation (IWSLT), 2022

184

13 May 2022

LibriS2S: A German-English Speech-to-Speech Translation Corpus

International Conference on Language Resources and Evaluation (LREC), 2022

Pedro Jeuris

Jan Niehues

197

22 Apr 2022

Large-Scale Streaming End-to-End Speech Translation with Neural Transducers

Interspeech (Interspeech), 2022

331

11 Apr 2022