Exploration of Efficient End-to-End ASR using Discretized Input from Self-Supervised Learning

Interspeech, 2023
29 May 2023
Xuankai Chang
Brian Yan
Yuya Fujita
Takashi Maekaku
Shinji Watanabe

Papers citing "Exploration of Efficient End-to-End ASR using Discretized Input from Self-Supervised Learning"

14 / 14 papers shown
Benchmarking Training Paradigms, Dataset Composition, and Model Scaling for Child ASR in ESPnet
Workshop on Child, Computer and Interaction (CCI), 2025
Anyu Ying
Natarajan Balaji Shankar
Chyi-Jiunn Lin
Mohan Shi
Pu Wang
Hye-jin Shim
Siddhant Arora
Hugo Van hamme
Abeer Alwan
Shinji Watanabe
22 Aug 2025
Benchmarking Prosody Encoding in Discrete Speech Tokens
Kentaro Onda
Satoru Fukayama
Daisuke Saito
Nobuaki Minematsu
15 Aug 2025
Discrete Speech Unit Extraction via Independent Component Analysis
Tomohiko Nakamura
Kwanghee Choi
Keigo Hojo
Yoshiaki Bando
Satoru Fukayama
Shinji Watanabe
11 Jan 2025
Exploring SSL Discrete Speech Features for Zipformer-based Contextual ASR
Mingyu Cui
Yifan Yang
Jiajun Deng
Jiawen Kang
Shujie Hu
Tianzi Wang
Zhaoqing Li
Shiliang Zhang
Xie Chen
Xunying Liu
13 Sep 2024
DiscreteSLU: A Large Language Model with Self-Supervised Discrete Speech Units for Spoken Language Understanding
Suwon Shon
Kwangyoun Kim
Yi-Te Hsu
Prashant Sridhar
Shinji Watanabe
Karen Livescu
AuLLM
13 Jun 2024
TokSing: Singing Voice Synthesis based on Discrete Tokens
Yuning Wu
Chunlei Zhang
Jiatong Shi
Yuxun Tang
Shan Yang
Qin Jin
12 Jun 2024
The Interspeech 2024 Challenge on Speech Processing Using Discrete Units
Xuankai Chang
Jiatong Shi
Jinchuan Tian
Yuning Wu
Yuxun Tang
Yihan Wu
Shinji Watanabe
Yossi Adi
Xie Chen
Qin Jin
11 Jun 2024
Acoustic BPE for Speech Generation with Discrete Tokens
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Feiyu Shen
Yiwei Guo
Chenpeng Du
Xie Chen
Kai Yu
23 Oct 2023
Cross-Modal Multi-Tasking for Speech-to-Text Translation via Hard Parameter Sharing
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Brian Yan
Xuankai Chang
Antonios Anastasopoulos
Yuya Fujita
Shinji Watanabe
27 Sep 2023
Unsupervised Accent Adaptation Through Masked Language Model Correction Of Discrete Self-Supervised Speech Units
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Jakob Poncelet
Hugo Van hamme
25 Sep 2023
Towards Practical and Efficient Image-to-Speech Captioning with Vision-Language Pre-training and Multi-modal Tokens
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Minsu Kim
J. Choi
Soumi Maiti
Jeong Hun Yeo
Shinji Watanabe
Y. Ro
VLM
15 Sep 2023
Towards Universal Speech Discrete Tokens: A Case Study for ASR and TTS
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Yifan Yang
Feiyu Shen
Chenpeng Du
Ziyang Ma
K. Yu
Daniel Povey
Xie Chen
14 Sep 2023
Lip Reading for Low-resource Languages by Learning and Combining General Speech Knowledge and Language-specific Knowledge
IEEE International Conference on Computer Vision (ICCV), 2023
Minsu Kim
Jeong Hun Yeo
J. Choi
Y. Ro
18 Aug 2023
AKVSR: Audio Knowledge Empowered Visual Speech Recognition by Compressing Audio Knowledge of a Pretrained Model
IEEE Transactions on Multimedia (IEEE TMM), 2023
Jeong Hun Yeo
Minsu Kim
J. Choi
Dae Hoe Kim
Y. Ro
15 Aug 2023