ESPnet-ST: All-in-One Speech Translation Toolkit

21 April 2020
Hirofumi Inaguma
Shun Kiyono
Kevin Duh
Shigeki Karita
Nelson Yalta
Tomoki Hayashi
Shinji Watanabe

Papers citing "ESPnet-ST: All-in-One Speech Translation Toolkit"

43 / 43 papers shown
AdaST: Dynamically Adapting Encoder States in the Decoder for End-to-End Speech-to-Text Translation
Wuwei Huang
Dexin Wang
Deyi Xiong
72
4
0
18 Mar 2025
CoSTA: Code-Switched Speech Translation using Aligned Speech-Text Interleaving
Bhavani Shankar
P. Jyothi
Pushpak Bhattacharyya
48
1
0
16 Jun 2024
Label-Synchronous Neural Transducer for E2E Simultaneous Speech Translation
Keqi Deng
Philip C. Woodland
43
4
0
06 Jun 2024
Few-Shot Spoken Language Understanding via Joint Speech-Text Models
Chung-Ming Chien
Mingjiamei Zhang
Ju-Chieh Chou
Karen Livescu
34
3
0
09 Oct 2023
Incremental Blockwise Beam Search for Simultaneous Speech Translation with Controllable Quality-Latency Tradeoff
Peter Polák
Brian Yan
Shinji Watanabe
A. Waibel
Ondrej Bojar
28
9
0
20 Sep 2023
DUB: Discrete Unit Back-translation for Speech Translation
Dong Zhang
Rong Ye
Tom Ko
Mingxuan Wang
Yaqian Zhou
21
23
0
19 May 2023
A Comparative Study on E-Branchformer vs Conformer in Speech Recognition, Translation, and Understanding Tasks
Yifan Peng
Kwangyoun Kim
Felix Wu
Brian Yan
Siddhant Arora
William Chen
Jiyang Tang
Suwon Shon
Prashant Sridhar
Shinji Watanabe
29
17
0
18 May 2023
DropDim: A Regularization Method for Transformer Networks
Hao Zhang
Dan Qu
Kejia Shao
Xu Yang
28
12
0
20 Apr 2023
Improving Speech Translation by Cross-Modal Multi-Grained Contrastive Learning
Hao Zhang
Nianwen Si
Yaqi Chen
Wenlin Zhang
Xukui Yang
Dan Qu
Weiqiang Zhang
35
9
0
20 Apr 2023
ESPnet-ST-v2: Multipurpose Spoken Language Translation Toolkit
Brian Yan
Jiatong Shi
Yun Tang
Hirofumi Inaguma
Yifan Peng
...
Zhaoheng Ni
Moto Hira
Soumi Maiti
J. Pino
Shinji Watanabe
19
20
0
10 Apr 2023
Efficient CTC Regularization via Coarse Labels for End-to-End Speech Translation
Biao Zhang
Barry Haddow
Rico Sennrich
17
3
0
21 Feb 2023
Align, Write, Re-order: Explainable End-to-End Speech Translation via Operation Sequence Generation
Motoi Omachi
Brian Yan
Siddharth Dalmia
Yuya Fujita
Shinji Watanabe
LRM
25
3
0
11 Nov 2022
Bridging Speech and Textual Pre-trained Models with Unsupervised ASR
Jiatong Shi
Chan-Jan Hsu
Ho-Lam Chung
Dongji Gao
Leibny Paola García-Perera
Shinji Watanabe
Ann Lee
Hung-yi Lee
32
12
0
06 Nov 2022
CTC Alignments Improve Autoregressive Translation
Brian Yan
Siddharth Dalmia
Yosuke Higuchi
Graham Neubig
Florian Metze
A. Black
Shinji Watanabe
44
33
0
11 Oct 2022
On the Impact of Noises in Crowd-Sourced Data for Speech Translation
Siqi Ouyang
Rong Ye
Lei Li
17
8
0
28 Jun 2022
PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit
Hui Zhang
Tian Yuan
Junkun Chen
Xintong Li
Renjie Zheng
...
Zeyu Chen
Xiaoguang Hu
Dianhai Yu
Yanjun Ma
Liang Huang
AuLLM
31
24
0
20 May 2022
Wav2Seq: Pre-training Speech-to-Text Encoder-Decoder Models Using Pseudo Languages
Felix Wu
Kwangyoun Kim
Shinji Watanabe
Kyu Jeong Han
Ryan T. McDonald
Kilian Q. Weinberger
Yoav Artzi
SyDa
48
37
0
02 May 2022
GigaST: A 10,000-hour Pseudo Speech Translation Corpus
Rong Ye
Chengqi Zhao
Tom Ko
Chutong Meng
Tao Wang
Mingxuan Wang
Jun Cao
9
23
0
08 Apr 2022
Does Simultaneous Speech Translation need Simultaneous Models?
Sara Papi
Marco Gaido
Matteo Negri
Marco Turchi
41
26
0
08 Apr 2022
Speech Segmentation Optimization using Segmented Bilingual Speech Corpus for End-to-end Speech Translation
Ryo Fukuda
Katsuhito Sudoh
Satoshi Nakamura
10
7
0
29 Mar 2022
STEMM: Self-learning with Speech-text Manifold Mixup for Speech Translation
Qingkai Fang
Rong Ye
Lei Li
Yang Feng
Mingxuan Wang
29
95
0
20 Mar 2022
ESPnet-SLU: Advancing Spoken Language Understanding through ESPnet
Siddhant Arora
Siddharth Dalmia
Pavel Denisov
Xuankai Chang
Yushi Ueda
...
Karthik Ganesan
Brian Yan
Ngoc Thang Vu
A. Black
Shinji Watanabe
VLM
33
74
0
29 Nov 2021
Attention-based Multi-hypothesis Fusion for Speech Summarization
Takatomo Kano
A. Ogawa
Marc Delcroix
Shinji Watanabe
22
13
0
16 Nov 2021
WaveFake: A Data Set to Facilitate Audio Deepfake Detection
Joel Frank
Lea Schönherr
DiffM
129
123
0
04 Nov 2021
An Exploration of Self-Supervised Pretrained Representations for End-to-End Speech Recognition
Xuankai Chang
Takashi Maekaku
Pengcheng Guo
Jing Shi
Yen-Ju Lu
...
Tianzi Wang
Shu-Wen Yang
Yu Tsao
Hung-yi Lee
Shinji Watanabe
SSL
AI4TS
24
81
0
09 Oct 2021
SpliceOut: A Simple and Efficient Audio Augmentation Method
Arjit Jain
Pranay Reddy Samala
Deepak Mittal
P. Jyothi
M. Singh
28
10
0
30 Sep 2021
Fast-MD: Fast Multi-Decoder End-to-End Speech Translation with Non-Autoregressive Hidden Intermediates
Hirofumi Inaguma
Siddharth Dalmia
Brian Yan
Shinji Watanabe
65
11
0
27 Sep 2021
ESPnet-ST IWSLT 2021 Offline Speech Translation System
Hirofumi Inaguma
Shun Kiyono
Nelson Enrique Yalta Soplin
Pengcheng Guo
Jun Suzuki
Kevin Duh
Shinji Watanabe
3DV
35
2
0
01 Jul 2021
Stacked Acoustic-and-Textual Encoding: Integrating the Pre-trained Models into Speech Translation Encoders
Chen Xu
Bojie Hu
Yanyang Li
Yuhao Zhang
Shen Huang
Qi Ju
Tong Xiao
Jingbo Zhu
22
75
0
12 May 2021
Learning Shared Semantic Space for Speech-to-Text Translation
Chi Han
Mingxuan Wang
Heng Ji
Lei Li
18
76
0
07 May 2021
Searchable Hidden Intermediates for End-to-End Models of Decomposable Sequence Tasks
Siddharth Dalmia
Brian Yan
Vikas Raunak
Florian Metze
Shinji Watanabe
39
30
0
02 May 2021
AlloST: Low-resource Speech Translation without Source Transcription
Yao-Fei Cheng
Hung-Shin Lee
Hsin-Min Wang
19
8
0
01 May 2021
End-to-end Speech Translation via Cross-modal Progressive Training
Rong Ye
Mingxuan Wang
Lei Li
28
71
0
21 Apr 2021
NeurST: Neural Speech Translation Toolkit
Chengqi Zhao
Mingxuan Wang
Qianqian Dong
Rong Ye
Lei Li
30
32
0
18 Dec 2020
Dual-decoder Transformer for Joint Automatic Speech Recognition and Multilingual Speech Translation
Hang Le
J. Pino
Changhan Wang
Jiatao Gu
D. Schwab
Laurent Besacier
39
82
0
02 Nov 2020
Recent Developments on ESPnet Toolkit Boosted by Conformer
Pengcheng Guo
Florian Boyer
Xuankai Chang
Tomoki Hayashi
Yosuke Higuchi
...
Jing Shi
Shinji Watanabe
Kun Wei
Wangyou Zhang
Yuekai Zhang
45
262
0
26 Oct 2020
A Technical Report: BUT Speech Translation Systems
Hari Krishna Vydana
L. Burget
J. Černocký
24
0
0
22 Oct 2020
A General Multi-Task Learning Framework to Leverage Text Data for Speech to Text Tasks
Yun Tang
J. Pino
Changhan Wang
Xutai Ma
Dmitriy Genzel
26
73
0
21 Oct 2020
Transformer based unsupervised pre-training for acoustic representation learning
Ruixiong Zhang
Haiwei Wu
Wubo Li
Dongwei Jiang
Wei Zou
Xiangang Li
SSL
ViT
27
27
0
29 Jul 2020
NeMo: a toolkit for building AI applications using Neural Modules
Oleksii Kuchaiev
Jason Chun Lok Li
Huyen Nguyen
Oleksii Hrinchuk
Ryan Leary
...
Jack Cook
P. Castonguay
Mariya Popova
Jocelyn Huang
Jonathan M. Cohen
211
292
0
14 Sep 2019
Tied Multitask Learning for Neural Speech Translation
Antonios Anastasopoulos
David Chiang
100
172
0
19 Feb 2018
End-to-End Automatic Speech Translation of Audiobooks
Alexandre Berard
Laurent Besacier
A. Kocabiyikoglu
Olivier Pietquin
75
190
0
12 Feb 2018
OpenNMT: Open-Source Toolkit for Neural Machine Translation
Guillaume Klein
Yoon Kim
Yuntian Deng
Jean Senellart
Alexander M. Rush
271
1,896
0
10 Jan 2017