A General Multi-Task Learning Framework to Leverage Text Data for Speech to Text Tasks

21 October 2020

Papers citing "A General Multi-Task Learning Framework to Leverage Text Data for Speech to Text Tasks"

Improving Language and Modality Transfer in Translation by Character-level Modeling. Annual Meeting of the Association for Computational Linguistics (ACL), 2025 Ioannis Tsiamas David Dale Marta R. Costa-jussá 100 3 0 30 May 2025
LLaST: Improved End-to-end Speech Translation System Leveraged by Large Language Models Xi Chen Songyang Zhang Qibing Bai Kai-xiang Chen Satoshi Nakamura AuLLM 268 17 0 22 Jul 2024
An Adapter-Based Unified Model for Multiple Spoken Language Processing Tasks Varsha Suresh Salah Ait-Mokhtar Caroline Brun Ioan Calapodescu 141 1 0 20 Jun 2024
CoSTA: Code-Switched Speech Translation using Aligned Speech-Text Interleaving Bhavani Shankar Preethi Jyothi Pushpak Bhattacharyya 246 4 0 16 Jun 2024
StreamSpeech: Simultaneous Speech-to-Speech Translation with Multi-task Learning Shaolei Zhang Qingkai Fang Shoutao Guo Zhengrui Ma Min Zhang Yang Feng 221 19 0 05 Jun 2024
Automatic Speech Recognition using Advanced Deep Learning Approaches: A survey Hamza Kheddar Mustapha Hemis Yassine Himeur OffRL 194 131 0 02 Mar 2024
Pushing the Limits of Zero-shot End-to-End Speech Translation Ioannis Tsiamas Gerard I. Gállego José A. R. Fonollosa Marta R. Costa-jussá 240 15 0 16 Feb 2024
Rethinking and Improving Multi-task Learning for End-to-end Speech Translation. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 Yuhao Zhang Chen Xu Bei Li Hao Chen Tong Xiao Chunliang Zhang Jingbo Zhu 165 9 0 07 Nov 2023
Audio-AdapterFusion: A Task-ID-free Approach for Efficient and Non-Destructive Multi-task Speech Recognition. Automatic Speech Recognition & Understanding (ASRU), 2023 Hillary Ngai Rohan Agrawal Neeraj Gaur Ronny Huang Parisa Haghani P. M. Mengibar MoMe 198 1 0 17 Oct 2023
An Empirical Study of Consistency Regularization for End-to-End Speech-to-Text Translation. North American Chapter of the Association for Computational Linguistics (NAACL), 2023 Pengzhi Gao Ruiqing Zhang Zhongjun He Hua Wu Haifeng Wang 148 7 0 28 Aug 2023
Improving Joint Speech-Text Representations Without Alignment. Interspeech (Interspeech), 2023 Cal Peyser Zhong Meng Ke Hu Rohit Prabhavalkar Andrew Rosenberg Tara N. Sainath M. Picheny Dong Wang VLM 185 4 0 11 Aug 2023
Improving End-to-End Speech Translation by Imitation-Based Knowledge Distillation with Synthetic Transcripts. International Workshop on Spoken Language Translation (IWSLT), 2023 Rebekka Hubert Artem Sokolov Stefan Riezler 170 1 0 17 Jul 2023
Performance Comparison of Pre-trained Models for Speech-to-Text in Turkish: Whisper-Small and Wav2Vec2-XLS-R-300M. Türkiye Bilişim Vakfı Bilgisayar Bilimleri ve Mühendisliği dergisi (TBBMD), 2023 Ö. B. Mercan Sercan Cepni D. E. Tasar Şükrü Ozan VLM 93 1 0 06 Jul 2023
Recent Advances in Direct Speech-to-text Translation. International Joint Conference on Artificial Intelligence (IJCAI), 2023 Chen Xu Rong Ye Qianqian Dong Chengqi Zhao Tom Ko Mingxuan Wang Tong Xiao Jingbo Zhu 246 28 0 20 Jun 2023
End-to-End Simultaneous Speech Translation with Differentiable Segmentation. Annual Meeting of the Association for Computational Linguistics (ACL), 2023 Shaolei Zhang Yang Feng 162 25 0 25 May 2023
CMOT: Cross-modal Mixup via Optimal Transport for Speech Translation. Annual Meeting of the Association for Computational Linguistics (ACL), 2023 Yan Zhou Qingkai Fang Yang Feng OT 292 40 0 24 May 2023
Improving speech translation by fusing speech and text. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 Wenbiao Yin Zhicheng Liu Chengqi Zhao Tao Wang Jian-Fei Tong Rong Ye 168 4 0 23 May 2023
Back Translation for Speech-to-text Translation Without Transcripts. Annual Meeting of the Association for Computational Linguistics (ACL), 2023 Qingkai Fang Yang Feng 156 16 0 15 May 2023
Understanding and Bridging the Modality Gap for Speech Translation. Annual Meeting of the Association for Computational Linguistics (ACL), 2023 Qingkai Fang Yang Feng 209 29 0 15 May 2023
Hybrid Transducer and Attention based Encoder-Decoder Modeling for Speech-to-Text Tasks. Annual Meeting of the Association for Computational Linguistics (ACL), 2023 Yun Tang Anna Y. Sun Hirofumi Inaguma Xinyue Chen Ning Dong Xutai Ma Paden Tomasello J. Pino 218 26 0 04 May 2023
Deep Transfer Learning for Automatic Speech Recognition: Towards Better Generalization. Knowledge-Based Systems (KBS), 2023 Hamza Kheddar Yassine Himeur S. Al-Maadeed Abbes Amira F. Bensaali 250 115 0 27 Apr 2023
MixSpeech: Cross-Modality Self-Learning with Audio-Visual Stream Mixup for Visual Speech Translation and Recognition. IEEE International Conference on Computer Vision (ICCV), 2023 Xize Cheng Lin Li Tao Jin Rongjie Huang Wang Lin Zehan Wang Huangdai Liu Yejin Wang Aoxiong Yin Zhou Zhao 193 29 0 09 Mar 2023
Multi-task Highly Adaptive Lasso Ivana Malenica Rachael V. Phillips D. Lazzareschi Jeremy Coyle Romain Pirracchio Mark van der Laan 200 0 0 27 Jan 2023
Pre-training for Speech Translation: CTC Meets Optimal Transport. International Conference on Machine Learning (ICML), 2023 Hang Le Hongyu Gong Changhan Wang J. Pino Benjamin Lecouteux D. Schwab OT 292 30 0 27 Jan 2023
Mu$^{2}$SLAM: Multitask, Multilingual Speech and Language Models. International Conference on Machine Learning (ICML), 2022 Yong Cheng Yu Zhang Melvin Johnson Wolfgang Macherey Ankur Bapna 147 9 0 19 Dec 2022
WACO: Word-Aligned Contrastive Learning for Speech Translation. Annual Meeting of the Association for Computational Linguistics (ACL), 2022 Siqi Ouyang Rong Ye Lei Li 273 34 0 19 Dec 2022
AdaTranS: Adapting with Boundary-based Shrinking for End-to-End Speech Translation. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022 Xingshan Zeng Liangyou Li Qun Liu 133 6 0 17 Dec 2022
Improving End-to-end Speech Translation by Leveraging Auxiliary Speech and Text Data. AAAI Conference on Artificial Intelligence (AAAI), 2022 Yuhao Zhang Chen Xu Bojie Hu Chunliang Zhang Tong Xiao Jingbo Zhu 142 17 0 04 Dec 2022
MMSpeech: Multi-modal Multi-task Encoder-Decoder Pre-training for Speech Recognition. Interspeech (Interspeech), 2022 Xiaohuan Zhou Jiaming Wang Zeyu Cui Shiliang Zhang Zhijie Yan Jingren Zhou Chang Zhou 204 13 0 29 Nov 2022
T5lephone: Bridging Speech and Text Self-supervised Models for Spoken Language Understanding via Phoneme level T5. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022 Chan-Jan Hsu Ho-Lam Chung Hung-yi Lee Yu Tsao 271 6 0 01 Nov 2022
Speech-text based multi-modal training with bidirectional attention for improved speech recognition. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022 Yuhang Yang Haihua Xu Hao-Ming Huang Eng Siong Chng Sheng Li 160 7 0 01 Nov 2022
Don't Discard Fixed-Window Audio Segmentation in Speech-to-Text Translation. Conference on Machine Translation (WMT), 2022 Chantal Amrhein Barry Haddow 134 10 0 24 Oct 2022
Discrete Cross-Modal Alignment Enables Zero-Shot Speech Translation. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022 Chen Wang Yuchen Liu Boxing Chen Jiajun Zhang Wei Luo Zhongqiang Huang Chengqing Zong 163 10 0 18 Oct 2022
Efficient acoustic feature transformation in mismatched environments using a Guided-GAN. Speech Communication (Speech Commun.), 2022 Walter Heymans Marelie Hattingh Davel C. van Heerden 205 1 0 03 Oct 2022
Improving Deliberation by Text-Only and Semi-Supervised Training. Interspeech (Interspeech), 2022 Ke Hu Tara N. Sainath Yanzhang He Rohit Prabhavalkar Trevor Strohman S. Mavandadi Weiran Wang 211 12 0 29 Jun 2022
Cross-modal Contrastive Learning for Speech Translation. North American Chapter of the Association for Computational Linguistics (NAACL), 2022 Rong Ye Mingxuan Wang Lei Li SSL 217 102 0 05 May 2022
Hear No Evil: Towards Adversarial Robustness of Automatic Speech Recognition via Multi-Task Learning. Interspeech (Interspeech), 2022 Nilaksh Das Duen Horng Chau AAML 112 0 0 05 Apr 2022
STEMM: Self-learning with Speech-text Manifold Mixup for Speech Translation. Annual Meeting of the Association for Computational Linguistics (ACL), 2022 Qingkai Fang Rong Ye Lei Li Yang Feng Mingxuan Wang 268 108 0 20 Mar 2022
Tackling data scarcity in speech translation using zero-shot multilingual machine translation techniques. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022 Tu Anh Dinh Danni Liu Jan Niehues 139 7 0 26 Jan 2022
Optimizing Alignment of Speech and Language Latent Spaces for End-to-End Speech Recognition and Understanding. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021 Wei Wang Shuo Ren Yao Qian Shujie Liu Yu Shi Y. Qian Michael Zeng 131 21 0 23 Oct 2021
SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing Junyi Ao Rui Wang Long Zhou Chengyi Wang Shuo Ren ... Yu Zhang Zhihua Wei Yao Qian Jinyu Li Furu Wei 290 247 0 14 Oct 2021
ASR Rescoring and Confidence Estimation with ELECTRA Hayato Futami Hirofumi Inaguma Masato Mimura S. Sakai Tatsuya Kawahara KELM 178 22 0 05 Oct 2021
FST: the FAIR Speech Translation System for the IWSLT21 Multilingual Shared Task. International Workshop on Spoken Language Translation (IWSLT), 2021 Yun Tang Hongyu Gong Xian Li Changhan Wang J. Pino Holger Schwenk Naman Goyal 146 11 0 14 Jul 2021
Zero-shot Speech Translation Tu Anh Dinh 136 6 0 13 Jul 2021
Improving Speech Translation by Understanding and Learning from the Auxiliary Text Translation Task Yun Tang J. Pino Xian Li Changhan Wang Dmitriy Genzel 230 91 0 12 Jul 2021
The Volctrans Neural Speech Translation System for IWSLT 2021. International Workshop on Spoken Language Translation (IWSLT), 2021 Chengqi Zhao Zhicheng Liu Jian-Fei Tong Tao Wang Mingxuan Wang Rong Ye Qianqian Dong Jun Cao Lei Li 158 8 0 16 May 2021
Learning Shared Semantic Space for Speech-to-Text Translation. Findings (Findings), 2021 Chi Han Mingxuan Wang Heng Ji Lei Li 321 84 0 07 May 2021
End-to-end Speech Translation via Cross-modal Progressive Training. Interspeech (Interspeech), 2021 Rong Ye Mingxuan Wang Lei Li 212 78 0 21 Apr 2021
Large-Scale Self- and Semi-Supervised Learning for Speech Translation. Interspeech (Interspeech), 2021 Changhan Wang Anne Wu J. Pino Alexei Baevski Michael Auli Alexis Conneau SSL 172 47 0 14 Apr 2021