2201.03713
CVSS Corpus and Massively Multilingual Speech-to-Speech Translation
International Conference on Language Resources and Evaluation (LREC), 2022
11 January 2022
Ye Jia
Michelle Tadmor Ramanovich
Quan Wang
Heiga Zen
SLR
Papers citing
"CVSS Corpus and Massively Multilingual Speech-to-Speech Translation"
50 / 54 papers shown
Improving Direct Persian-English Speech-to-Speech Translation with Discrete Units and Synthetic Parallel Data
Physical Review X (PRX), 2025
Sina Rashidi
Hossein Sameti
114
0
0
16 Nov 2025
MTP-S2UT: Enhancing Speech-to-Speech Translation Quality with Multi-token Prediction
Jianjin Wang
Runsong Zhao
Xiaoqian Liu
Yuan Ge
Ziqiang Xu
Tong Xiao
Shengxiang Gao
Z. Yu
Jingbo Zhu
144
0
0
11 Oct 2025
UniSS: Unified Expressive Speech-to-Speech Translation with Your Voice
Sitong Cheng
Weizhen Bian
Xinsheng Wang
Ruibin Yuan
Jianyi Chen
Shunshun Yin
Wenhan Luo
Wei Xue
193
0
0
25 Sep 2025
MLLM-based Speech Recognition: When and How is Multimodality Beneficial?
Yiwen Guan
V. Trinh
Vivek Voleti
Jacob Whitehill
274
1
0
25 Jul 2025
Step-Audio 2 Technical Report
Boyong Wu
Chao Yan
Chen Hu
Cheng Yi
Chengli Feng
...
Yuanwei Lu
Yuchu Luo
Yuhe Yin
Yumeng Zhan
Y. Zhang
AuLLM
352
0
0
22 Jul 2025
Scheduled Interleaved Speech-Text Training for Speech-to-Speech Translation with LLMs
Hayato Futami
E. Tsunoo
Yosuke Kashiwagi
Yuki Ito
Hassan Shahmohammadi
Siddhant Arora
Shinji Watanabe
AuLLM
287
1
0
12 Jun 2025
Miipher-2: A Universal Speech Restoration Model for Million-Hour Scale Data Restoration
Shigeki Karita
Yuma Koizumi
Heiga Zen
Haruko Ishikawa
Robin Scheibler
M. Bacchiani
VLM
1.1K
5
0
07 May 2025
Universal Speech Token Learning via Low-Bitrate Neural Codec and Pretrained Representations
IEEE Journal on Selected Topics in Signal Processing (JSTSP), 2024
Xue Jiang
Xiulian Peng
Yuan Zhang
Yan Lu
SSL
415
6
0
15 Mar 2025
Audio-FLAN: A Preliminary Release
Liumeng Xue
Ziya Zhou
J. Pan
Zhiyu Li
Shuai Fan
...
Haohe Liu
Emmanouil Benetos
Ge Zhang
Wenhan Luo
Wei Xue
MLLM
AuLLM
CLIP
VLM
325
2
0
23 Feb 2025
High-Fidelity Simultaneous Speech-To-Speech Translation
Tom Labiausse
Laurent Mazaré
Edouard Grave
P. Pérez
Alexandre Défossez
Neil Zeghidour
1.1K
19
0
05 Feb 2025
Recent Advances in Speech Language Models: A Survey
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Wenqian Cui
Dianzhi Yu
Xiaoqi Jiao
Ziqiao Meng
Guangyan Zhang
Qichao Wang
Yiwen Guo
Irwin King
AuLLM
717
89
0
01 Oct 2024
Multi-modal Speech Transformer Decoders: When Do Multiple Modalities Improve Accuracy?
Yiwen Guan
V. Trinh
Vivek Voleti
Jacob Whitehill
348
2
0
13 Sep 2024
Analyzing Speech Unit Selection for Textless Speech-to-Speech Translation
J. Duret
Yannick Esteve
Titouan Parcollet
241
0
0
08 Jul 2024
MSR-86K: An Evolving, Multilingual Corpus with 86,300 Hours of Transcribed Audio for Speech Recognition Research
Song Li
Yongbin You
Xuezhi Wang
Zhengkun Tian
Ke Ding
Guanglu Wan
250
12
0
26 Jun 2024
Diffusion Synthesizer for Efficient Multilingual Speech to Speech Translation
Interspeech (Interspeech), 2024
Nameer Hirschkind
Xiao Yu
Joseph Liu
Eloi DuBois
...
Colin Sinclair
Kyle Spence
Charles Shang
Zoë Abrams
Morgan McGuire
189
1
0
14 Jun 2024
TransVIP: Speech to Speech Translation System with Voice and Isochrony Preservation
Chenyang Le
Yao Qian
Dongmei Wang
Long Zhou
Shujie Liu
...
Midia Yousefi
Yanmin Qian
Jinyu Li
Sheng Zhao
Michael Zeng
376
15
0
28 May 2024
CrossVoice: Crosslingual Prosody Preserving Cascade-S2ST using Transfer Learning
Medha Hira
Arnav Goel
Anubha Gupta
298
2
0
23 May 2024
DiffNorm: Self-Supervised Normalization for Non-autoregressive Speech-to-speech Translation
Weiting Tan
Jingyu Zhang
Lingfeng Shen
Daniel Khashabi
Philipp Koehn
286
1
0
22 May 2024
FFSTC: Fongbe to French Speech Translation Corpus
International Conference on Language Resources and Evaluation (LREC), 2024
D. F. Kponou
F. Laleye
E. C. Ezin
243
3
0
08 Mar 2024
Direct Punjabi to English speech translation using discrete units
Prabhjot Kaur
L. A. M. Bush
Weisong Shi
253
2
0
25 Feb 2024
TranSentence: Speech-to-speech Translation via Language-agnostic Sentence-level Speech Encoding without Language-parallel Data
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Seung-Bin Kim
Sang-Hoon Lee
Seong-Whan Lee
214
6
0
17 Jan 2024
DASpeech: Directed Acyclic Transformer for Fast and High-quality Speech-to-Speech Translation
Neural Information Processing Systems (NeurIPS), 2023
Qingkai Fang
Yan Zhou
Yang Feng
254
17
0
11 Oct 2023
Speech-to-Speech Translation with Discrete-Unit-Based Style Transfer
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Yongqiang Wang
Jionghao Bai
Rongjie Huang
Ruiqi Li
Zhiqing Hong
Zhou Zhao
202
9
0
14 Sep 2023
Direct Text to Speech Translation System using Acoustic Units
IEEE Signal Processing Letters (IEEE SPL), 2023
Victoria Mingote
Pablo Gimeno
Luis Vicente
Sameer Khurana
Antoine Laurent
J. Duret
184
7
0
14 Sep 2023
Sparks of Large Audio Models: A Survey and Outlook
S. Latif
Moazzam Shoukat
Fahad Shamshad
Muhammad Usama
Yi Ren
...
Wenwu Wang
Xulong Zhang
Roberto Togneri
Xiaoshi Zhong
Björn W. Schuller
LM&MA
AuLLM
821
56
0
24 Aug 2023
Many-to-Many Spoken Language Translation via Unified Speech and Text Representation Learning with Unit-to-Unit Translation
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Minsu Kim
J. Choi
Dahun Kim
Y. Ro
265
10
0
03 Aug 2023
Multilingual Speech-to-Speech Translation into Multiple Target Languages
Hongyu Gong
Ning Dong
Sravya Popuri
Vedanuj Goswami
Ann Lee
J. Pino
236
5
0
17 Jul 2023
Towards cross-language prosody transfer for dialog
Interspeech (Interspeech), 2023
Jonathan Avila
Nigel G. Ward
317
7
0
09 Jul 2023
AudioPaLM: A Large Language Model That Can Speak and Listen
Paul Kishan Rubenstein
Chulayuth Asawaroengchai
D. Nguyen
Ankur Bapna
Zalan Borsos
...
Neil Zeghidour
Yu Zhang
Zhishuai Zhang
Lukás Zilka
Christian Frank
LM&MA
AuLLM
VLM
418
425
0
22 Jun 2023
HK-LegiCoST: Leveraging Non-Verbatim Transcripts for Speech Translation
Interspeech (Interspeech), 2023
Cihan Xiao
Lin Zhang
Jinyi Yang
Dongji Gao
Kevin Duh
Sanjeev Khudanpur
264
3
0
20 Jun 2023
PolyVoice: Language Models for Speech to Speech Translation
International Conference on Learning Representations (ICLR), 2023
Qianqian Dong
Zhiying Huang
Qiao Tian
Chen Xu
Tom Ko
...
Lu Lu
Zejun Ma
Yuping Wang
Mingxuan Wang
Yuxuan Wang
339
30
0
05 Jun 2023
Translatotron 3: Speech to Speech Translation with Monolingual Data
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Eliya Nachmani
Alon Levkovitch
Yi-Yang Ding
Chulayuth Asawaroengchai
Heiga Zen
Michelle Tadmor Ramanovich
382
25
0
27 May 2023
AV-TranSpeech: Audio-Visual Robust Speech-to-Speech Translation
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Rongjie Huang
Huadai Liu
Xize Cheng
Yi Ren
Lin Li
...
Jinzheng He
Lichao Zhang
Jinglin Liu
Xiaoyue Yin
Zhou Zhao
257
10
0
24 May 2023
i-Code Studio: A Configurable and Composable Framework for Integrative AI
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Yuwei Fang
Mahmoud Khademi
Chenguang Zhu
Ziyi Yang
Reid Pryzant
...
Yao Qian
Takuya Yoshioka
Lu Yuan
Michael Zeng
Xuedong Huang
238
2
0
23 May 2023
Duplex Diffusion Models Improve Speech-to-Speech Translation
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Xianchao Wu
DiffM
263
6
0
22 May 2023
MParrotTTS: Multilingual Multi-speaker Text to Speech Synthesis in Low Resource Setting
Neil Shah
Vishal Tambrahalli
Saiteja Kosgi
N. Pedanekar
Vineet Gandhi
183
1
0
19 May 2023
Improving Cascaded Unsupervised Speech Translation with Denoising Back-translation
Yu-Kuan Fu
Liang-Hsuan Tseng
Jiatong Shi
Chen-An Li
Tsung-Yuan Hsu
Shinji Watanabe
Hung-yi Lee
168
6
0
12 May 2023
Enhancing Speech-to-Speech Translation with Multiple TTS Targets
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Jiatong Shi
Yun Tang
Ann Lee
Hirofumi Inaguma
Changhan Wang
J. Pino
Shinji Watanabe
183
11
0
10 Apr 2023
ESPnet-ST-v2: Multipurpose Spoken Language Translation Toolkit
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Brian Yan
Jiatong Shi
Yun Tang
Hirofumi Inaguma
Yifan Peng
...
Zhaoheng Ni
Moto Hira
Soumi Maiti
J. Pino
Shinji Watanabe
278
23
0
10 Apr 2023
Speak Foreign Languages with Your Own Voice: Cross-Lingual Neural Codec Language Modeling
Zi-Hua Zhang
Long Zhou
Chengyi Wang
Sanyuan Chen
Yu Wu
...
Huaming Wang
Jinyu Li
Lei He
Sheng Zhao
Furu Wei
VLM
426
252
0
07 Mar 2023
NusaCrowd: Open Source Initiative for Indonesian NLP Resources
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Samuel Cahyawijaya
Holy Lovenia
Alham Fikri Aji
Genta Indra Winata
Bryan Wilie
...
Timothy Baldwin
Sebastian Ruder
Herry Sujaini
S. Sakti
Ayu Purwarianti
536
71
0
19 Dec 2022
UnitY: Two-pass Direct Speech-to-speech Translation with Discrete Units
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Hirofumi Inaguma
Sravya Popuri
Ilia Kulikov
Peng-Jen Chen
Changhan Wang
Yu-An Chung
Yun Tang
Ann Lee
Shinji Watanabe
J. Pino
397
82
0
15 Dec 2022
VideoDubber: Machine Translation with Speech-Aware Length Control for Video Dubbing
AAAI Conference on Artificial Intelligence (AAAI), 2022
Yihan Wu
Junliang Guo
Xuejiao Tan
Chen Zhang
Bohan Li
Ruihua Song
Lei He
Sheng Zhao
Arul Menezes
Jiang Bian
190
32
0
30 Nov 2022
Dialogs Re-enacted Across Languages
Nigel G. Ward
Jonathan Avila
Emilia Rivas
Divette Marco
246
2
0
18 Nov 2022
Speech-to-Speech Translation For A Real-world Unwritten Language
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Peng-Jen Chen
Ke M. Tran
Yilin Yang
Jingfei Du
Justine T. Kao
...
Sravya Popuri
Changhan Wang
J. Pino
Wei-Ning Hsu
Ann Lee
378
38
0
11 Nov 2022
SpeechMatrix: A Large-Scale Mined Corpus of Multilingual Speech-to-Speech Translations
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Paul-Ambroise Duquenne
Hongyu Gong
Ning Dong
Jingfei Du
Ann Lee
Vedanuj Goswami
Changhan Wang
J. Pino
Benoît Sagot
Holger Schwenk
297
44
0
08 Nov 2022
Textless Direct Speech-to-Speech Translation with Discrete Speech Representation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Xinjian Li
Ye Jia
Chung-Cheng Chiu
323
33
0
31 Oct 2022
Joint Pre-Training with Speech and Bilingual Text for Direct Speech to Speech Translation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Kun Wei
Long Zhou
Zi-Hua Zhang
Liping Chen
Shujie Liu
Lei He
Jinyu Li
Furu Wei
214
17
0
31 Oct 2022
A Textless Metric for Speech-to-Speech Comparison
Laurent Besacier
S. Ribeiro
Olivier Galibert
Ioan Calapodescu
326
5
0
21 Oct 2022
Simple and Effective Unsupervised Speech Translation
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Changhan Wang
Hirofumi Inaguma
Peng-Jen Chen
Ilia Kulikov
Yun Tang
Wei-Ning Hsu
Michael Auli
J. Pino
SSL
275
20
0
18 Oct 2022