
A General Multi-Task Learning Framework to Leverage Text Data for Speech to Text Tasks

21 October 2020

Papers citing "A General Multi-Task Learning Framework to Leverage Text Data for Speech to Text Tasks"

49 / 49 papers shown

Improving Language and Modality Transfer in Translation by Character-level Modeling

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

Ioannis Tsiamas

David Dale

Marta R. Costa-jussá

136

30 May 2025

LLaST: Improved End-to-end Speech Translation System Leveraged by Large Language Models

304

22 Jul 2024

An Adapter-Based Unified Model for Multiple Spoken Language Processing Tasks

Varsha Suresh

Salah Ait-Mokhtar

Caroline Brun

Ioan Calapodescu

173

20 Jun 2024

CoSTA: Code-Switched Speech Translation using Aligned Speech-Text Interleaving

Bhavani Shankar

Preethi Jyothi

Pushpak Bhattacharyya

306

16 Jun 2024

StreamSpeech: Simultaneous Speech-to-Speech Translation with Multi-task Learning

Shaolei Zhang

Yang Feng

257

05 Jun 2024

Automatic Speech Recognition using Advanced Deep Learning Approaches: A survey

Hamza Kheddar

Mustapha Hemis

Yassine Himeur

OffRL

258

138

02 Mar 2024

Pushing the Limits of Zero-shot End-to-End Speech Translation

264

16 Feb 2024

Rethinking and Improving Multi-task Learning for End-to-end Speech Translation

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Jingbo Zhu

193

07 Nov 2023

Audio-AdapterFusion: A Task-ID-free Approach for Efficient and Non-Destructive Multi-task Speech Recognition

Automatic Speech Recognition & Understanding (ASRU), 2023

222

17 Oct 2023

An Empirical Study of Consistency Regularization for End-to-End Speech-to-Text Translation

North American Chapter of the Association for Computational Linguistics (NAACL), 2023

184

28 Aug 2023

Improving Joint Speech-Text Representations Without Alignment

Interspeech (Interspeech), 2023

Andrew Rosenberg

213

11 Aug 2023

Improving End-to-End Speech Translation by Imitation-Based Knowledge Distillation with Synthetic Transcripts

International Workshop on Spoken Language Translation (IWSLT), 2023

Rebekka Hubert

Artem Sokolov

Stefan Riezler

214

17 Jul 2023

Performance Comparison of Pre-trained Models for Speech-to-Text in Turkish: Whisper-Small and Wav2Vec2-XLS-R-300M

Türkiye Bilişim Vakfı Bilgisayar Bilimleri ve Mühendisliği dergisi (TBBMD), 2023

109

06 Jul 2023

Recent Advances in Direct Speech-to-text Translation

International Joint Conference on Artificial Intelligence (IJCAI), 2023

Jingbo Zhu

270

20 Jun 2023

End-to-End Simultaneous Speech Translation with Differentiable Segmentation

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Shaolei Zhang

Yang Feng

254

25 May 2023

CMOT: Cross-modal Mixup via Optimal Transport for Speech Translation

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Yan Zhou

Qingkai Fang

Yang Feng

328

24 May 2023

Improving speech translation by fusing speech and text

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

204

23 May 2023

Back Translation for Speech-to-text Translation Without Transcripts

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Qingkai Fang

Yang Feng

204

15 May 2023

Understanding and Bridging the Modality Gap for Speech Translation

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Qingkai Fang

Yang Feng

241

15 May 2023

Hybrid Transducer and Attention based Encoder-Decoder Modeling for Speech-to-Text Tasks

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

258

04 May 2023

Deep Transfer Learning for Automatic Speech Recognition: Towards Better Generalization

Knowledge-Based Systems (KBS), 2023

Hamza Kheddar

290

117

27 Apr 2023

MixSpeech: Cross-Modality Self-Learning with Audio-Visual Stream Mixup for Visual Speech Translation and Recognition

IEEE International Conference on Computer Vision (ICCV), 2023

Xize Cheng

Rongjie Huang

Zhou Zhao

205

09 Mar 2023

Multi-task Highly Adaptive Lasso

224

27 Jan 2023

Pre-training for Speech Translation: CTC Meets Optimal Transport

International Conference on Machine Learning (ICML), 2023

352

27 Jan 2023

Mu$^{2}$SLAM: Multitask, Multilingual Speech and Language Models

International Conference on Machine Learning (ICML), 2022

155

19 Dec 2022

WACO: Word-Aligned Contrastive Learning for Speech Translation

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

Siqi Ouyang

Rong Ye

Lei Li

321

19 Dec 2022

AdaTranS: Adapting with Boundary-based Shrinking for End-to-End Speech Translation

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

Xingshan Zeng

Liangyou Li

Qun Liu

149

17 Dec 2022

Improving End-to-end Speech Translation by Leveraging Auxiliary Speech and Text Data

AAAI Conference on Artificial Intelligence (AAAI), 2022

Jingbo Zhu

170

04 Dec 2022

MMSpeech: Multi-modal Multi-task Encoder-Decoder Pre-training for Speech Recognition

Interspeech (Interspeech), 2022

Xiaohuan Zhou

Jiaming Wang

Zeyu Cui

Shiliang Zhang

Zhijie Yan

Jingren Zhou

Chang Zhou

228

29 Nov 2022

T5lephone: Bridging Speech and Text Self-supervised Models for Spoken Language Understanding via Phoneme level T5

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

291

01 Nov 2022

Speech-text based multi-modal training with bidirectional attention for improved speech recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Sheng Li

176

01 Nov 2022

Don't Discard Fixed-Window Audio Segmentation in Speech-to-Text Translation

Conference on Machine Translation (WMT), 2022

Chantal Amrhein

Barry Haddow

186

24 Oct 2022

Discrete Cross-Modal Alignment Enables Zero-Shot Speech Translation

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

191

18 Oct 2022

Efficient acoustic feature transformation in mismatched environments using a Guided-GAN

Speech Communication (Speech Commun.), 2022

Walter Heymans

Marelie Hattingh Davel

C. van Heerden

229

03 Oct 2022

Improving Deliberation by Text-Only and Semi-Supervised Training

Interspeech (Interspeech), 2022

227

29 Jun 2022

Cross-modal Contrastive Learning for Speech Translation

North American Chapter of the Association for Computational Linguistics (NAACL), 2022

Rong Ye

Mingxuan Wang

Lei Li

SSL

241

102

05 May 2022

Hear No Evil: Towards Adversarial Robustness of Automatic Speech Recognition via Multi-Task Learning

Interspeech (Interspeech), 2022

Nilaksh Das

Duen Horng Chau

AAML

147

05 Apr 2022

STEMM: Self-learning with Speech-text Manifold Mixup for Speech Translation

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

Lei Li

288

108

20 Mar 2022

Tackling data scarcity in speech translation using zero-shot multilingual machine translation techniques

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Tu Anh Dinh

Danni Liu

Jan Niehues

139

26 Jan 2022

Optimizing Alignment of Speech and Language Latent Spaces for End-to-End Speech Recognition and Understanding

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021

159

23 Oct 2021

SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing

Rui Wang

...

346

250

14 Oct 2021

ASR Rescoring and Confidence Estimation with ELECTRA

202

05 Oct 2021

FST: the FAIR Speech Translation System for the IWSLT21 Multilingual Shared Task

International Workshop on Spoken Language Translation (IWSLT), 2021

Xian Li

170

14 Jul 2021

Zero-shot Speech Translation

Tu Anh Dinh

156

13 Jul 2021

Improving Speech Translation by Understanding and Learning from the Auxiliary Text Translation Task

Xian Li

258

12 Jul 2021

The Volctrans Neural Speech Translation System for IWSLT 2021

International Workshop on Spoken Language Translation (IWSLT), 2021

Lei Li

203

16 May 2021

Learning Shared Semantic Space for Speech-to-Text Translation

Findings (Findings), 2021

Chi Han

Mingxuan Wang

Heng Ji

Lei Li

417

07 May 2021

End-to-end Speech Translation via Cross-modal Progressive Training

Interspeech (Interspeech), 2021

Rong Ye

Mingxuan Wang

Lei Li

220

21 Apr 2021

Large-Scale Self- and Semi-Supervised Learning for Speech Translation

Interspeech (Interspeech), 2021

217

14 Apr 2021