1703.08581
Cited By
Sequence-to-Sequence Models Can Directly Translate Foreign Speech
24 March 2017
Ron J. Weiss
J. Chorowski
Navdeep Jaitly
Yonghui Wu
Zhifeng Chen
Papers citing
"Sequence-to-Sequence Models Can Directly Translate Foreign Speech"
50 / 204 papers shown
Title
DWIM: Towards Tool-aware Visual Reasoning via Discrepancy-aware Workflow Generation & Instruct-Masking Tuning
Fucai Ke
Vijay Kumar B G
Xingjian Leng
Zhixi Cai
Zaid Khan
Weiqing Wang
P. D. Haghighi
H. Rezatofighi
Manmohan Chandraker
51
0
0
25 Mar 2025
AdaST: Dynamically Adapting Encoder States in the Decoder for End-to-End Speech-to-Text Translation
Wuwei Huang
Dexin Wang
Deyi Xiong
72
4
0
18 Mar 2025
Joint Training And Decoding for Multilingual End-to-End Simultaneous Speech Translation
Wuwei Huang
Renren Jin
Wen Zhang
Jian Luan
Bin Wang
Deyi Xiong
69
1
0
14 Mar 2025
MAVFlow: Preserving Paralinguistic Elements with Conditional Flow Matching for Zero-Shot AV2AV Multilingual Translation
Sungwoo Cho
J. Choi
Sungnyun Kim
Se-Young Yun
65
0
0
14 Mar 2025
Speech to Speech Translation with Translatotron: A State of the Art Review
Jules R. Kala
Emmanuel Adetiba
Abdultaofeek Abayom
Oluwatobi E. Dare
Ayodele H. Ifijeh
163
0
0
21 Feb 2025
High-Fidelity Simultaneous Speech-To-Speech Translation
Tom Labiausse
Laurent Mazaré
Edouard Grave
P. Pérez
Alexandre Défossez
Neil Zeghidour
245
0
0
05 Feb 2025
Prepending or Cross-Attention for Speech-to-Text? An Empirical Comparison
Tsz Kin Lam
Marco Gaido
Sara Papi
L. Bentivogli
Barry Haddow
36
0
0
04 Jan 2025
BhasaAnuvaad: A Speech Translation Dataset for 13 Indian Languages
Sparsh Jain
Ashwin Sankar
Devilal Choudhary
Dhairya Suman
Nikhil Narasimhan
Mohammed Safi Ur Rahman Khan
Anoop Kunchukuttan
Mitesh M. Khapra
Raj Dabre
47
2
0
07 Nov 2024
LLM-Ref: Enhancing Reference Handling in Technical Writing with Large Language Models
Kazi Ahmed Asif Fuad
Lizhong Chen
26
0
0
01 Nov 2024
CTC-GMM: CTC guided modality matching for fast and accurate streaming speech translation
Rui Zhao
Jinyu Li
Ruchao Fan
Matt Post
41
1
0
07 Oct 2024
CoT-ST: Enhancing LLM-based Speech Translation with Multimodal Chain-of-Thought
Yexing Du
Ziyang Ma
Yifan Yang
Keqi Deng
Xie Chen
Bo Yang
Yang Xiang
Ming Liu
Bing Qin
LRM
26
6
0
29 Sep 2024
LLaST: Improved End-to-end Speech Translation System Leveraged by Large Language Models
Xi Chen
Songyang Zhang
Qibing Bai
Kai-xiang Chen
Satoshi Nakamura
AuLLM
37
6
0
22 Jul 2024
Exploring the Capability of Mamba in Speech Applications
Koichi Miyazaki
Yoshiki Masuyama
Masato Murata
Mamba
40
12
0
24 Jun 2024
CoSTA: Code-Switched Speech Translation using Aligned Speech-Text Interleaving
Bhavani Shankar
Preethi Jyothi
Pushpak Bhattacharyya
50
1
0
16 Jun 2024
Lightweight Audio Segmentation for Long-form Speech Translation
Jaesong Lee
Soyoon Kim
Hanbyul Kim
Joon Son Chung
38
0
0
15 Jun 2024
Simul-Whisper: Attention-Guided Streaming Whisper with Truncation Detection
Haoyu Wang
Guoqiang Hu
Guodong Lin
Wei-Qiang Zhang
Jian Li
30
1
0
14 Jun 2024
Soft Language Identification for Language-Agnostic Many-to-One End-to-End Speech Translation
Peidong Wang
Jian Xue
Jinyu Li
Junkun Chen
Aswin Shanmugam Subramanian
31
0
0
12 Jun 2024
StreamAtt: Direct Streaming Speech-to-Text Translation with Attention-based Audio History Selection
Sara Papi
Marco Gaido
Matteo Negri
L. Bentivogli
79
4
0
10 Jun 2024
SBAAM! Eliminating Transcript Dependency in Automatic Subtitling
Marco Gaido
Sara Papi
Matteo Negri
Mauro Cettolo
L. Bentivogli
43
1
0
17 May 2024
A Survey of Generative Techniques for Spatial-Temporal Data Mining
Qianru Zhang
Haixin Wang
Cheng Long
Liangcai Su
Xingwei He
...
Tailin Wu
Hongzhi Yin
Siu-Ming Yiu
Qi Tian
Christian S. Jensen
AI4TS
57
7
0
15 May 2024
Efficient Monotonic Multihead Attention
Xutai Ma
Anna Y. Sun
Siqi Ouyang
Hirofumi Inaguma
Paden Tomasello
46
4
0
07 Dec 2023
End-to-End Speech-to-Text Translation: A Survey
Nivedita Sethiya
Chandresh Kumar Maurya
32
7
0
02 Dec 2023
Rethinking and Improving Multi-task Learning for End-to-end Speech Translation
Yuhao Zhang
Chen Xu
Bei Li
Hao Chen
Tong Xiao
Chunliang Zhang
Jingbo Zhu
26
6
0
07 Nov 2023
End-to-End Single-Channel Speaker-Turn Aware Conversational Speech Translation
Juan Pablo Zuluaga
Zhaocheng Huang
Xing Niu
Rohit Paturi
S. Srinivasan
Prashant Mathur
Brian Thompson
Marcello Federico
BDL
35
2
0
01 Nov 2023
Towards a Deep Understanding of Multilingual End-to-End Speech Translation
Haoran Sun
Xiaohu Zhao
Yikun Lei
Shaolin Zhu
Deyi Xiong
42
8
0
31 Oct 2023
Modular Speech-to-Text Translation for Zero-Shot Cross-Modal Transfer
Paul-Ambroise Duquenne
Holger Schwenk
Benoît Sagot
50
3
0
05 Oct 2023
Tuning Large language model for End-to-end Speech Translation
Hao Zhang
Nianwen Si
Yaqi Chen
Wenlin Zhang
Xu Yang
Dan Qu
Xiaolin Jiao
20
8
0
03 Oct 2023
Direct Models for Simultaneous Translation and Automatic Subtitling: FBK@IWSLT2023
Sara Papi
Marco Gaido
Matteo Negri
43
7
0
27 Sep 2023
Improving End-to-End Speech Translation by Imitation-Based Knowledge Distillation with Synthetic Transcripts
Rebekka Hubert
Artem Sokolov
Stefan Riezler
27
1
0
17 Jul 2023
On decoder-only architecture for speech-to-text and large language model integration
Jian Wu
Yashesh Gaur
Zhuo Chen
Long Zhou
Yilun Zhu
...
Jinyu Li
Shujie Liu
Bo Ren
Linquan Liu
Yu-Huan Wu
AuLLM
33
121
0
08 Jul 2023
Recent Advances in Direct Speech-to-text Translation
Chen Xu
Rong Ye
Qianqian Dong
Chengqi Zhao
Tom Ko
Mingxuan Wang
Tong Xiao
Jingbo Zhu
27
18
0
20 Jun 2023
Modality Adaption or Regularization? A Case Study on End-to-End Speech Translation
Yucheng Han
Chen Xu
Tong Xiao
Jingbo Zhu
30
3
0
13 Jun 2023
STT4SG-350: A Speech Corpus for All Swiss German Dialect Regions
Michel Plüss
Jan Deriu
Yanick Schraner
Claudio Paonessa
Julia Hartmann
...
Christian Scheller
Manuela Hurlimann
Tanja Samardžić
Manfred Vogel
Mark Cieliebak
28
16
0
30 May 2023
CTC-based Non-autoregressive Speech Translation
Chen Xu
Xiaoqian Liu
Xiaowen Liu
Qingxuan Sun
Yuhao Zhang
...
Tom Ko
Mingxuan Wang
Tong Xiao
Anxiang Ma
Jingbo Zhu
25
11
0
27 May 2023
End-to-End Simultaneous Speech Translation with Differentiable Segmentation
Shaolei Zhang
Yang Feng
32
17
0
25 May 2023
ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation
Chenyang Le
Yao Qian
Long Zhou
Shujie Liu
Yanmin Qian
Michael Zeng
Xuedong Huang
24
13
0
24 May 2023
Improving speech translation by fusing speech and text
Wenbiao Yin
Zhicheng Liu
Chengqi Zhao
Tao Wang
Jian-Fei Tong
Rong Ye
15
4
0
23 May 2023
Duplex Diffusion Models Improve Speech-to-Speech Translation
Xianchao Wu
DiffM
25
4
0
22 May 2023
Understanding and Bridging the Modality Gap for Speech Translation
Qingkai Fang
Yang Feng
32
25
0
15 May 2023
Improving Cascaded Unsupervised Speech Translation with Denoising Back-translation
Yu-Kuan Fu
Liang-Hsuan Tseng
Jiatong Shi
Chen-An Li
Tsung-Yuan Hsu
Shinji Watanabe
Hung-yi Lee
25
4
0
12 May 2023
Hybrid Transducer and Attention based Encoder-Decoder Modeling for Speech-to-Text Tasks
Yun Tang
Anna Y. Sun
Hirofumi Inaguma
Xinyue Chen
Ning Dong
Xutai Ma
Paden Tomasello
J. Pino
48
19
0
04 May 2023
DropDim: A Regularization Method for Transformer Networks
Hao Zhang
Dan Qu
Kejia Shao
Xu Yang
28
12
0
20 Apr 2023
Improving Speech Translation by Cross-Modal Multi-Grained Contrastive Learning
Hao Zhang
Nianwen Si
Yaqi Chen
Wenlin Zhang
Xukui Yang
Dan Qu
Weiqiang Zhang
35
9
0
20 Apr 2023
When Good and Reproducible Results are a Giant with Feet of Clay: The Importance of Software Quality in NLP
Sara Papi
Marco Gaido
Andrea Pilzer
Matteo Negri
59
10
0
28 Mar 2023
Transformers in Speech Processing: A Survey
S. Latif
Aun Zaidi
Heriberto Cuayáhuitl
Fahad Shamshad
Moazzam Shoukat
Junaid Qadir
46
47
0
21 Mar 2023
Pre-training for Speech Translation: CTC Meets Optimal Transport
Hang Le
Hongyu Gong
Changhan Wang
J. Pino
Benjamin Lecouteux
D. Schwab
OT
13
22
0
27 Jan 2023
SegAugment: Maximizing the Utility of Speech Translation Data with Segmentation-based Augmentations
Ioannis Tsiamas
José A. R. Fonollosa
Marta R. Costa-jussá
41
6
0
19 Dec 2022
Mu$^{2}$SLAM: Multitask, Multilingual Speech and Language Models
Yong Cheng
Yu Zhang
Melvin Johnson
Wolfgang Macherey
Ankur Bapna
33
8
0
19 Dec 2022
WACO: Word-Aligned Contrastive Learning for Speech Translation
Siqi Ouyang
Rong Ye
Lei Li
32
25
0
19 Dec 2022
UnitY: Two-pass Direct Speech-to-speech Translation with Discrete Units
Hirofumi Inaguma
Sravya Popuri
Ilia Kulikov
Peng-Jen Chen
Changhan Wang
Yu-An Chung
Yun Tang
Ann Lee
Shinji Watanabe
J. Pino
53
51
0
15 Dec 2022