Exploration of Efficient End-to-End ASR using Discretized Input from Self-Supervised Learning

Interspeech, 2023

29 May 2023

Shinji Watanabe

Papers citing "Exploration of Efficient End-to-End ASR using Discretized Input from Self-Supervised Learning"

Benchmarking Training Paradigms, Dataset Composition, and Model Scaling for Child ASR in ESPnet

Workshop on Child, Computer and Interaction (CCI), 2025

Anyu Ying

Natarajan Balaji Shankar

122

22 Aug 2025

Benchmarking Prosody Encoding in Discrete Speech Tokens

15 Aug 2025

Discrete Speech Unit Extraction via Independent Component Analysis

203

11 Jan 2025

Exploring SSL Discrete Speech Features for Zipformer-based Contextual ASR

252

13 Sep 2024

DiscreteSLU: A Large Language Model with Self-Supervised Discrete Speech Units for Spoken Language Understanding

Shinji Watanabe

296

13 Jun 2024

TokSing: Singing Voice Synthesis based on Discrete Tokens

Jiatong Shi

256

12 Jun 2024

The Interspeech 2024 Challenge on Speech Processing Using Discrete Units

Xuankai Chang

Jiatong Shi

Jinchuan Tian

Yuning Wu

Yuxun Tang

Yihan Wu

Shinji Watanabe

Yossi Adi

Xie Chen

Qin Jin

225

11 Jun 2024

Acoustic BPE for Speech Generation with Discrete Tokens

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Feiyu Shen

Yiwei Guo

Chenpeng Du

Xie Chen

Kai Yu

320

23 Oct 2023

Cross-Modal Multi-Tasking for Speech-to-Text Translation via Hard Parameter Sharing

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Brian Yan

Xuankai Chang

Antonios Anastasopoulos

Yuya Fujita

Shinji Watanabe

282

27 Sep 2023

Unsupervised Accent Adaptation Through Masked Language Model Correction of Discrete Self-Supervised Speech Units

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Jakob Poncelet

Hugo Van hamme

152

25 Sep 2023

Towards Practical and Efficient Image-to-Speech Captioning with Vision-Language Pre-training and Multi-modal Tokens

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Jeong Hun Yeo

194

15 Sep 2023

Towards Universal Speech Discrete Tokens: A Case Study for ASR and TTS

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Yifan Yang

Chenpeng Du

Xie Chen

214

14 Sep 2023

Lip Reading for Low-resource Languages by Learning and Combining General Speech Knowledge and Language-specific Knowledge

IEEE International Conference on Computer Vision (ICCV), 2023

Minsu Kim

Jeong Hun Yeo

J. Choi

Y. Ro

209

18 Aug 2023

AKVSR: Audio Knowledge Empowered Visual Speech Recognition by Compressing Audio Knowledge of a Pretrained Model

IEEE Transactions on Multimedia (IEEE TMM), 2023

Jeong Hun Yeo

187

15 Aug 2023