arXiv: 2010.11338
A General Multi-Task Learning Framework to Leverage Text Data for Speech to Text Tasks

21 October 2020
Yun Tang
J. Pino
Changhan Wang
Xutai Ma
Dmitriy Genzel

Papers citing "A General Multi-Task Learning Framework to Leverage Text Data for Speech to Text Tasks"

49 / 49 papers shown
Improving Language and Modality Transfer in Translation by Character-level Modeling
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Ioannis Tsiamas
David Dale
Marta R. Costa-jussá
100
3
0
30 May 2025
LLaST: Improved End-to-end Speech Translation System Leveraged by Large Language Models
Xi Chen
Songyang Zhang
Qibing Bai
Kai-xiang Chen
Satoshi Nakamura
AuLLM
268
17
0
22 Jul 2024
An Adapter-Based Unified Model for Multiple Spoken Language Processing Tasks
Varsha Suresh
Salah Ait-Mokhtar
Caroline Brun
Ioan Calapodescu
141
1
0
20 Jun 2024
CoSTA: Code-Switched Speech Translation using Aligned Speech-Text Interleaving
Bhavani Shankar
Preethi Jyothi
Pushpak Bhattacharyya
246
4
0
16 Jun 2024
StreamSpeech: Simultaneous Speech-to-Speech Translation with Multi-task Learning
Shaolei Zhang
Qingkai Fang
Shoutao Guo
Zhengrui Ma
Min Zhang
Yang Feng
221
19
0
05 Jun 2024
Automatic Speech Recognition using Advanced Deep Learning Approaches: A survey
Hamza Kheddar
Mustapha Hemis
Yassine Himeur
OffRL
194
131
0
02 Mar 2024
Pushing the Limits of Zero-shot End-to-End Speech Translation
Ioannis Tsiamas
Gerard I. Gállego
José A. R. Fonollosa
Marta R. Costa-jussá
240
15
0
16 Feb 2024
Rethinking and Improving Multi-task Learning for End-to-end Speech Translation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Yuhao Zhang
Chen Xu
Bei Li
Hao Chen
Tong Xiao
Chunliang Zhang
Jingbo Zhu
165
9
0
07 Nov 2023
Audio-AdapterFusion: A Task-ID-free Approach for Efficient and Non-Destructive Multi-task Speech Recognition
Automatic Speech Recognition & Understanding (ASRU), 2023
Hillary Ngai
Rohan Agrawal
Neeraj Gaur
Ronny Huang
Parisa Haghani
P. M. Mengibar
MoMe
198
1
0
17 Oct 2023
An Empirical Study of Consistency Regularization for End-to-End Speech-to-Text Translation
North American Chapter of the Association for Computational Linguistics (NAACL), 2023
Pengzhi Gao
Ruiqing Zhang
Zhongjun He
Hua Wu
Haifeng Wang
148
7
0
28 Aug 2023
Improving Joint Speech-Text Representations Without Alignment
Interspeech, 2023
Cal Peyser
Zhong Meng
Ke Hu
Rohit Prabhavalkar
Andrew Rosenberg
Tara N. Sainath
M. Picheny
Dong Wang
VLM
185
4
0
11 Aug 2023
Improving End-to-End Speech Translation by Imitation-Based Knowledge Distillation with Synthetic Transcripts
International Workshop on Spoken Language Translation (IWSLT), 2023
Rebekka Hubert
Artem Sokolov
Stefan Riezler
170
1
0
17 Jul 2023
Performance Comparison of Pre-trained Models for Speech-to-Text in Turkish: Whisper-Small and Wav2Vec2-XLS-R-300M
Türkiye Bilişim Vakfı Bilgisayar Bilimleri ve Mühendisliği dergisi (TBBMD), 2023
Ö. B. Mercan
Sercan Cepni
D. E. Tasar
Şükrü Ozan
VLM
93
1
0
06 Jul 2023
Recent Advances in Direct Speech-to-text Translation
International Joint Conference on Artificial Intelligence (IJCAI), 2023
Chen Xu
Rong Ye
Qianqian Dong
Chengqi Zhao
Tom Ko
Mingxuan Wang
Tong Xiao
Jingbo Zhu
246
28
0
20 Jun 2023
End-to-End Simultaneous Speech Translation with Differentiable Segmentation
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Shaolei Zhang
Yang Feng
162
25
0
25 May 2023
CMOT: Cross-modal Mixup via Optimal Transport for Speech Translation
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Yan Zhou
Qingkai Fang
Yang Feng
OT
292
40
0
24 May 2023
Improving speech translation by fusing speech and text
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Wenbiao Yin
Zhicheng Liu
Chengqi Zhao
Tao Wang
Jian-Fei Tong
Rong Ye
168
4
0
23 May 2023
Back Translation for Speech-to-text Translation Without Transcripts
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Qingkai Fang
Yang Feng
156
16
0
15 May 2023
Understanding and Bridging the Modality Gap for Speech Translation
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Qingkai Fang
Yang Feng
209
29
0
15 May 2023
Hybrid Transducer and Attention based Encoder-Decoder Modeling for Speech-to-Text Tasks
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Yun Tang
Anna Y. Sun
Hirofumi Inaguma
Xinyue Chen
Ning Dong
Xutai Ma
Paden Tomasello
J. Pino
218
26
0
04 May 2023
Deep Transfer Learning for Automatic Speech Recognition: Towards Better Generalization
Knowledge-Based Systems (KBS), 2023
Hamza Kheddar
Yassine Himeur
S. Al-Maadeed
Abbes Amira
F. Bensaali
250
115
0
27 Apr 2023
MixSpeech: Cross-Modality Self-Learning with Audio-Visual Stream Mixup for Visual Speech Translation and Recognition
IEEE International Conference on Computer Vision (ICCV), 2023
Xize Cheng
Lin Li
Tao Jin
Rongjie Huang
Wang Lin
Zehan Wang
Huangdai Liu
Yejin Wang
Aoxiong Yin
Zhou Zhao
193
29
0
09 Mar 2023
Multi-task Highly Adaptive Lasso
Ivana Malenica
Rachael V. Phillips
D. Lazzareschi
Jeremy Coyle
Romain Pirracchio
Mark van der Laan
200
0
0
27 Jan 2023
Pre-training for Speech Translation: CTC Meets Optimal Transport
International Conference on Machine Learning (ICML), 2023
Hang Le
Hongyu Gong
Changhan Wang
J. Pino
Benjamin Lecouteux
D. Schwab
OT
292
30
0
27 Jan 2023
Mu$^{2}$SLAM: Multitask, Multilingual Speech and Language Models
International Conference on Machine Learning (ICML), 2022
Yong Cheng
Yu Zhang
Melvin Johnson
Wolfgang Macherey
Ankur Bapna
147
9
0
19 Dec 2022
WACO: Word-Aligned Contrastive Learning for Speech Translation
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Siqi Ouyang
Rong Ye
Lei Li
273
34
0
19 Dec 2022
AdaTranS: Adapting with Boundary-based Shrinking for End-to-End Speech Translation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Xingshan Zeng
Liangyou Li
Qun Liu
133
6
0
17 Dec 2022
Improving End-to-end Speech Translation by Leveraging Auxiliary Speech and Text Data
AAAI Conference on Artificial Intelligence (AAAI), 2022
Yuhao Zhang
Chen Xu
Bojie Hu
Chunliang Zhang
Tong Xiao
Jingbo Zhu
142
17
0
04 Dec 2022
MMSpeech: Multi-modal Multi-task Encoder-Decoder Pre-training for Speech Recognition
Interspeech, 2022
Xiaohuan Zhou
Jiaming Wang
Zeyu Cui
Shiliang Zhang
Zhijie Yan
Jingren Zhou
Chang Zhou
204
13
0
29 Nov 2022
T5lephone: Bridging Speech and Text Self-supervised Models for Spoken Language Understanding via Phoneme level T5
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Chan-Jan Hsu
Ho-Lam Chung
Hung-yi Lee
Yu Tsao
271
6
0
01 Nov 2022
Speech-text based multi-modal training with bidirectional attention for improved speech recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Yuhang Yang
Haihua Xu
Hao-Ming Huang
Eng Siong Chng
Sheng Li
160
7
0
01 Nov 2022
Don't Discard Fixed-Window Audio Segmentation in Speech-to-Text Translation
Conference on Machine Translation (WMT), 2022
Chantal Amrhein
Barry Haddow
134
10
0
24 Oct 2022
Discrete Cross-Modal Alignment Enables Zero-Shot Speech Translation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Chen Wang
Yuchen Liu
Boxing Chen
Jiajun Zhang
Wei Luo
Zhongqiang Huang
Chengqing Zong
163
10
0
18 Oct 2022
Efficient acoustic feature transformation in mismatched environments using a Guided-GAN
Speech Communication (Speech Commun.), 2022
Walter Heymans
Marelie Hattingh Davel
C. van Heerden
205
1
0
03 Oct 2022
Improving Deliberation by Text-Only and Semi-Supervised Training
Interspeech, 2022
Ke Hu
Tara N. Sainath
Yanzhang He
Rohit Prabhavalkar
Trevor Strohman
S. Mavandadi
Weiran Wang
211
12
0
29 Jun 2022
Cross-modal Contrastive Learning for Speech Translation
North American Chapter of the Association for Computational Linguistics (NAACL), 2022
Rong Ye
Mingxuan Wang
Lei Li
SSL
217
102
0
05 May 2022
Hear No Evil: Towards Adversarial Robustness of Automatic Speech Recognition via Multi-Task Learning
Interspeech, 2022
Nilaksh Das
Duen Horng Chau
AAML
112
0
0
05 Apr 2022
STEMM: Self-learning with Speech-text Manifold Mixup for Speech Translation
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Qingkai Fang
Rong Ye
Lei Li
Yang Feng
Mingxuan Wang
268
108
0
20 Mar 2022
Tackling data scarcity in speech translation using zero-shot multilingual machine translation techniques
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Tu Anh Dinh
Danni Liu
Jan Niehues
139
7
0
26 Jan 2022
Optimizing Alignment of Speech and Language Latent Spaces for End-to-End Speech Recognition and Understanding
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Wei Wang
Shuo Ren
Yao Qian
Shujie Liu
Yu Shi
Y. Qian
Michael Zeng
131
21
0
23 Oct 2021
SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing
Junyi Ao
Rui Wang
Long Zhou
Chengyi Wang
Shuo Ren
...
Yu Zhang
Zhihua Wei
Yao Qian
Jinyu Li
Furu Wei
290
247
0
14 Oct 2021
ASR Rescoring and Confidence Estimation with ELECTRA
Hayato Futami
Hirofumi Inaguma
Masato Mimura
S. Sakai
Tatsuya Kawahara
KELM
178
22
0
05 Oct 2021
FST: the FAIR Speech Translation System for the IWSLT21 Multilingual Shared Task
International Workshop on Spoken Language Translation (IWSLT), 2021
Yun Tang
Hongyu Gong
Xian Li
Changhan Wang
J. Pino
Holger Schwenk
Naman Goyal
146
11
0
14 Jul 2021
Zero-shot Speech Translation
Tu Anh Dinh
136
6
0
13 Jul 2021
Improving Speech Translation by Understanding and Learning from the Auxiliary Text Translation Task
Yun Tang
J. Pino
Xian Li
Changhan Wang
Dmitriy Genzel
230
91
0
12 Jul 2021
The Volctrans Neural Speech Translation System for IWSLT 2021
International Workshop on Spoken Language Translation (IWSLT), 2021
Chengqi Zhao
Zhicheng Liu
Jian-Fei Tong
Tao Wang
Mingxuan Wang
Rong Ye
Qianqian Dong
Jun Cao
Lei Li
158
8
0
16 May 2021
Learning Shared Semantic Space for Speech-to-Text Translation
Findings, 2021
Chi Han
Mingxuan Wang
Heng Ji
Lei Li
321
84
0
07 May 2021
End-to-end Speech Translation via Cross-modal Progressive Training
Interspeech, 2021
Rong Ye
Mingxuan Wang
Lei Li
212
78
0
21 Apr 2021
Large-Scale Self- and Semi-Supervised Learning for Speech Translation
Interspeech, 2021
Changhan Wang
Anne Wu
J. Pino
Alexei Baevski
Michael Auli
Alexis Conneau
SSL
172
47
0
14 Apr 2021