ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1811.02050
  4. Cited By
Leveraging Weakly Supervised Data to Improve End-to-End Speech-to-Text
  Translation

Leveraging Weakly Supervised Data to Improve End-to-End Speech-to-Text Translation

5 November 2018
Ye Jia
Melvin Johnson
Wolfgang Macherey
Ron J. Weiss
Yuan Cao
Chung-Cheng Chiu
Naveen Ari
Stella Laurenzo
Yonghui Wu
ArXivPDFHTML

Papers citing "Leveraging Weakly Supervised Data to Improve End-to-End Speech-to-Text Translation"

50 / 100 papers shown
Title
AdaST: Dynamically Adapting Encoder States in the Decoder for End-to-End Speech-to-Text Translation
AdaST: Dynamically Adapting Encoder States in the Decoder for End-to-End Speech-to-Text Translation
Wuwei Huang
Dexin Wang
Deyi Xiong
72
4
0
18 Mar 2025
Joint Training And Decoding for Multilingual End-to-End Simultaneous Speech Translation
Wuwei Huang
Renren Jin
Wen Zhang
Jian Luan
Bin Wang
Deyi Xiong
61
1
0
14 Mar 2025
When End-to-End is Overkill: Rethinking Cascaded Speech-to-Text Translation
When End-to-End is Overkill: Rethinking Cascaded Speech-to-Text Translation
Anna Min
Chenxu Hu
Yi Ren
Hang Zhao
69
0
0
01 Feb 2025
CTC-GMM: CTC guided modality matching for fast and accurate streaming
  speech translation
CTC-GMM: CTC guided modality matching for fast and accurate streaming speech translation
Rui Zhao
Jinyu Li
Ruchao Fan
Matt Post
38
1
0
07 Oct 2024
MOSEL: 950,000 Hours of Speech Data for Open-Source Speech Foundation
  Model Training on EU Languages
MOSEL: 950,000 Hours of Speech Data for Open-Source Speech Foundation Model Training on EU Languages
Marco Gaido
Sara Papi
L. Bentivogli
A. Brutti
Mauro Cettolo
R. Gretter
M. Matassoni
Mohamed Nabih
Matteo Negri
39
0
0
01 Oct 2024
SimulTron: On-Device Simultaneous Speech to Speech Translation
SimulTron: On-Device Simultaneous Speech to Speech Translation
A. Agranovich
Eliya Nachmani
Oleg Rybakov
Yifan Ding
Ye Jia
Nadav Bar
Heiga Zen
Michelle Tadmor Ramanovich
44
0
0
04 Jun 2024
Pushing the Limits of Zero-shot End-to-End Speech Translation
Pushing the Limits of Zero-shot End-to-End Speech Translation
Ioannis Tsiamas
Gerard I. Gállego
José A. R. Fonollosa
Marta R. Costa-jussá
43
7
0
16 Feb 2024
AV2AV: Direct Audio-Visual Speech to Audio-Visual Speech Translation
  with Unified Audio-Visual Speech Representation
AV2AV: Direct Audio-Visual Speech to Audio-Visual Speech Translation with Unified Audio-Visual Speech Representation
J. Choi
Se Jin Park
Minsu Kim
Y. Ro
25
12
0
05 Dec 2023
End-to-End Speech-to-Text Translation: A Survey
End-to-End Speech-to-Text Translation: A Survey
Nivedita Sethiya
Chandresh Kumar Maurya
24
7
0
02 Dec 2023
Regret-Optimal Federated Transfer Learning for Kernel Regression with
  Applications in American Option Pricing
Regret-Optimal Federated Transfer Learning for Kernel Regression with Applications in American Option Pricing
Xuwei Yang
Anastasis Kratsios
Florian Krach
Matheus Grasselli
Aurélien Lucchi
FedML
21
2
0
08 Sep 2023
Sparks of Large Audio Models: A Survey and Outlook
Sparks of Large Audio Models: A Survey and Outlook
S. Latif
Moazzam Shoukat
Fahad Shamshad
Muhammad Usama
Yi Ren
...
Wenwu Wang
Xulong Zhang
Roberto Togneri
Erik Cambria
Björn W. Schuller
LM&MA
AuLLM
31
38
0
24 Aug 2023
Improving End-to-End Speech Translation by Imitation-Based Knowledge
  Distillation with Synthetic Transcripts
Improving End-to-End Speech Translation by Imitation-Based Knowledge Distillation with Synthetic Transcripts
Rebekka Hubert
Artem Sokolov
Stefan Riezler
19
1
0
17 Jul 2023
AudioPaLM: A Large Language Model That Can Speak and Listen
AudioPaLM: A Large Language Model That Can Speak and Listen
Paul Kishan Rubenstein
Chulayuth Asawaroengchai
D. Nguyen
Ankur Bapna
Zalan Borsos
...
Neil Zeghidour
Yu Zhang
Zhishuai Zhang
Lukás Zilka
Christian Frank
LM&MA
AuLLM
VLM
35
258
0
22 Jun 2023
Recent Advances in Direct Speech-to-text Translation
Recent Advances in Direct Speech-to-text Translation
Chen Xu
Rong Ye
Qianqian Dong
Chengqi Zhao
Tom Ko
Mingxuan Wang
Tong Xiao
Jingbo Zhu
19
18
0
20 Jun 2023
Translatotron 3: Speech to Speech Translation with Monolingual Data
Translatotron 3: Speech to Speech Translation with Monolingual Data
Eliya Nachmani
Alon Levkovitch
Yi-Yang Ding
Chulayutsh Asawaroengchai
Heiga Zen
Michelle Tadmor Ramanovich
21
14
0
27 May 2023
ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text
  Translation
ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation
Chenyang Le
Yao Qian
Long Zhou
Shujie Liu
Yanmin Qian
Michael Zeng
Xuedong Huang
24
13
0
24 May 2023
Improving speech translation by fusing speech and text
Improving speech translation by fusing speech and text
Wenbiao Yin
Zhicheng Liu
Chengqi Zhao
Tao Wang
Jian-Fei Tong
Rong Ye
15
4
0
23 May 2023
Back Translation for Speech-to-text Translation Without Transcripts
Back Translation for Speech-to-text Translation Without Transcripts
Qingkai Fang
Yang Feng
30
13
0
15 May 2023
Understanding and Bridging the Modality Gap for Speech Translation
Understanding and Bridging the Modality Gap for Speech Translation
Qingkai Fang
Yang Feng
27
25
0
15 May 2023
Improving Speech Translation by Cross-Modal Multi-Grained Contrastive
  Learning
Improving Speech Translation by Cross-Modal Multi-Grained Contrastive Learning
Hao Zhang
Nianwen Si
Yaqi Chen
Wenlin Zhang
Xukui Yang
Dan Qu
Weiqiang Zhang
35
9
0
20 Apr 2023
Leveraging Large Text Corpora for End-to-End Speech Summarization
Leveraging Large Text Corpora for End-to-End Speech Summarization
Kohei Matsuura
Takanori Ashihara
Takafumi Moriya
Tomohiro Tanaka
A. Ogawa
Marc Delcroix
Ryo Masumura
27
14
0
02 Mar 2023
WhisperX: Time-Accurate Speech Transcription of Long-Form Audio
WhisperX: Time-Accurate Speech Transcription of Long-Form Audio
Max Bain
Jaesung Huh
Tengda Han
Andrew Zisserman
26
203
0
01 Mar 2023
SegAugment: Maximizing the Utility of Speech Translation Data with
  Segmentation-based Augmentations
SegAugment: Maximizing the Utility of Speech Translation Data with Segmentation-based Augmentations
Ioannis Tsiamas
José A. R. Fonollosa
Marta R. Costa-jussá
41
6
0
19 Dec 2022
WACO: Word-Aligned Contrastive Learning for Speech Translation
WACO: Word-Aligned Contrastive Learning for Speech Translation
Siqi Ouyang
Rong Ye
Lei Li
29
25
0
19 Dec 2022
UnitY: Two-pass Direct Speech-to-speech Translation with Discrete Units
UnitY: Two-pass Direct Speech-to-speech Translation with Discrete Units
H. Inaguma
Sravya Popuri
Ilia Kulikov
Peng-Jen Chen
Changhan Wang
Yu-An Chung
Yun Tang
Ann Lee
Shinji Watanabe
J. Pino
43
51
0
15 Dec 2022
Improving End-to-end Speech Translation by Leveraging Auxiliary Speech
  and Text Data
Improving End-to-end Speech Translation by Leveraging Auxiliary Speech and Text Data
Yuhao Zhang
Chen Xu
Bojie Hu
Chunliang Zhang
Tong Xiao
Jingbo Zhu
18
15
0
04 Dec 2022
Efficient Speech Translation with Pre-trained Models
Efficient Speech Translation with Pre-trained Models
Zhaolin Li
J. Niehues
19
2
0
09 Nov 2022
A Weakly-Supervised Streaming Multilingual Speech Model with Truly
  Zero-Shot Capability
A Weakly-Supervised Streaming Multilingual Speech Model with Truly Zero-Shot Capability
Jian Xue
Peidong Wang
Jinyu Li
Eric Sun
29
10
0
04 Nov 2022
Improving Speech-to-Speech Translation Through Unlabeled Text
Improving Speech-to-Speech Translation Through Unlabeled Text
Xuan-Phi Nguyen
Sravya Popuri
Changhan Wang
Yun Tang
Ilia Kulikov
Hongyu Gong
17
9
0
26 Oct 2022
Discrete Cross-Modal Alignment Enables Zero-Shot Speech Translation
Discrete Cross-Modal Alignment Enables Zero-Shot Speech Translation
Chen Wang
Yuchen Liu
Boxing Chen
Jiajun Zhang
Wei Luo
Zhongqiang Huang
Chengqing Zong
31
10
0
18 Oct 2022
Generating Synthetic Speech from SpokenVocab for Speech Translation
Generating Synthetic Speech from SpokenVocab for Speech Translation
Jinming Zhao
Gholamreza Haffar
Ehsan Shareghi
11
5
0
15 Oct 2022
The YiTrans End-to-End Speech Translation System for IWSLT 2022 Offline
  Shared Task
The YiTrans End-to-End Speech Translation System for IWSLT 2022 Offline Shared Task
Ziqiang Zhang
Junyi Ao
Long Zhou
Shujie Liu
Furu Wei
Jinyu Li
17
9
0
12 Jun 2022
Leveraging Pseudo-labeled Data to Improve Direct Speech-to-Speech
  Translation
Leveraging Pseudo-labeled Data to Improve Direct Speech-to-Speech Translation
Qianqian Dong
Fengpeng Yue
Tom Ko
Mingxuan Wang
Qibing Bai
Yu Zhang
32
16
0
18 May 2022
Large-Scale Streaming End-to-End Speech Translation with Neural
  Transducers
Large-Scale Streaming End-to-End Speech Translation with Neural Transducers
Jian Xue
Peidong Wang
Jinyu Li
Matt Post
Yashesh Gaur
AI4TS
24
26
0
11 Apr 2022
GigaST: A 10,000-hour Pseudo Speech Translation Corpus
GigaST: A 10,000-hour Pseudo Speech Translation Corpus
Rong Ye
Chengqi Zhao
Tom Ko
Chutong Meng
Tao Wang
Mingxuan Wang
Jun Cao
9
23
0
08 Apr 2022
Enhanced Direct Speech-to-Speech Translation Using Self-supervised
  Pre-training and Data Augmentation
Enhanced Direct Speech-to-Speech Translation Using Self-supervised Pre-training and Data Augmentation
Sravya Popuri
Peng-Jen Chen
Changhan Wang
J. Pino
Yossi Adi
Jiatao Gu
Wei-Ning Hsu
Ann Lee
20
56
0
06 Apr 2022
Leveraging unsupervised and weakly-supervised data to improve direct
  speech-to-speech translation
Leveraging unsupervised and weakly-supervised data to improve direct speech-to-speech translation
Ye Jia
Yifan Ding
Ankur Bapna
Colin Cherry
Yu Zhang
Alexis Conneau
Nobuyuki Morioka
39
20
0
24 Mar 2022
STEMM: Self-learning with Speech-text Manifold Mixup for Speech
  Translation
STEMM: Self-learning with Speech-text Manifold Mixup for Speech Translation
Qingkai Fang
Rong Ye
Lei Li
Yang Feng
Mingxuan Wang
22
95
0
20 Mar 2022
Under the Morphosyntactic Lens: A Multifaceted Evaluation of Gender Bias
  in Speech Translation
Under the Morphosyntactic Lens: A Multifaceted Evaluation of Gender Bias in Speech Translation
Beatrice Savoldi
Marco Gaido
L. Bentivogli
Matteo Negri
Marco Turchi
38
26
0
18 Mar 2022
Sample, Translate, Recombine: Leveraging Audio Alignments for Data
  Augmentation in End-to-end Speech Translation
Sample, Translate, Recombine: Leveraging Audio Alignments for Data Augmentation in End-to-end Speech Translation
Tsz Kin Lam
Shigehiko Schamoni
Stefan Riezler
17
32
0
16 Mar 2022
Tackling data scarcity in speech translation using zero-shot
  multilingual machine translation techniques
Tackling data scarcity in speech translation using zero-shot multilingual machine translation techniques
Tu Anh Dinh
Danni Liu
J. Niehues
24
6
0
26 Jan 2022
CVSS Corpus and Massively Multilingual Speech-to-Speech Translation
CVSS Corpus and Massively Multilingual Speech-to-Speech Translation
Yeting Jia
Michelle Tadmor Ramanovich
Quan Wang
Heiga Zen
SLR
28
66
0
11 Jan 2022
Optimizing Alignment of Speech and Language Latent Spaces for End-to-End
  Speech Recognition and Understanding
Optimizing Alignment of Speech and Language Latent Spaces for End-to-End Speech Recognition and Understanding
Wei Wang
Shuo Ren
Yao Qian
Shujie Liu
Yu Shi
Y. Qian
Michael Zeng
32
16
0
23 Oct 2021
Learning When to Translate for Streaming Speech
Learning When to Translate for Streaming Speech
Qianqian Dong
Yaoming Zhu
Mingxuan Wang
Lei Li
50
29
0
15 Sep 2021
Non-autoregressive End-to-end Speech Translation with Parallel
  Autoregressive Rescoring
Non-autoregressive End-to-end Speech Translation with Parallel Autoregressive Rescoring
H. Inaguma
Yosuke Higuchi
Kevin Duh
Tatsuya Kawahara
Shinji Watanabe
55
11
0
09 Sep 2021
Translatotron 2: High-quality direct speech-to-speech translation with
  voice preservation
Translatotron 2: High-quality direct speech-to-speech translation with voice preservation
Ye Jia
Michelle Tadmor Ramanovich
Tal Remez
Roi Pomerantz
26
67
0
19 Jul 2021
Zero-shot Speech Translation
Zero-shot Speech Translation
Tu Anh Dinh
25
6
0
13 Jul 2021
Direct speech-to-speech translation with discrete units
Direct speech-to-speech translation with discrete units
Ann Lee
Peng-Jen Chen
Changhan Wang
Jiatao Gu
Sravya Popuri
...
Yossi Adi
Qing He
Yun Tang
J. Pino
Wei-Ning Hsu
27
180
0
12 Jul 2021
Dealing with training and test segmentation mismatch: FBK@IWSLT2021
Dealing with training and test segmentation mismatch: FBK@IWSLT2021
Sara Papi
Marco Gaido
Matteo Negri
Marco Turchi
31
6
0
23 Jun 2021
RealTranS: End-to-End Simultaneous Speech Translation with Convolutional
  Weighted-Shrinking Transformer
RealTranS: End-to-End Simultaneous Speech Translation with Convolutional Weighted-Shrinking Transformer
Xingshan Zeng
Liangyou Li
Qun Liu
17
45
0
09 Jun 2021
12
Next