Delayed-KD: Delayed Knowledge Distillation based CTC for Low-Latency Streaming ASR

28 May 2025
Longhao Li, Yangze Li, Hongfei Xue, Jie Liu, Shuai Fang, Kai Wang, Lei Xie
Abstract

CTC-based streaming ASR has gained significant attention in real-world applications but faces two main challenges: accuracy degradation in small chunks and token emission latency. To mitigate these challenges, we propose Delayed-KD, which applies delayed knowledge distillation on CTC posterior probabilities from a non-streaming to a streaming model. Specifically, with a tiny chunk size, we introduce a Temporal Alignment Buffer (TAB) that defines a relative delay range compared to the non-streaming teacher model to align CTC outputs and mitigate non-blank token mismatches. Additionally, TAB enables fine-grained control over token emission delay. Experiments on 178-hour AISHELL-1 and 10,000-hour WenetSpeech Mandarin datasets show consistent superiority of Delayed-KD. Impressively, Delayed-KD at 40 ms latency achieves a lower character error rate (CER) of 5.42% on AISHELL-1, comparable to the competitive U2++ model running at 320 ms latency.
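To make the core idea more concrete, the sketch below shows one plausible reading of the delayed distillation objective: a KL divergence between frame-level CTC posteriors of the streaming student and the non-streaming teacher, where the Temporal Alignment Buffer (TAB) is modeled as a small search over relative delays. The function name, tensor shapes, and the choice to pick the best-aligned delay are illustrative assumptions, not the paper's exact formulation.

import torch
import torch.nn.functional as F

def delayed_kd_loss(student_log_probs: torch.Tensor,
                    teacher_log_probs: torch.Tensor,
                    max_delay: int = 2) -> torch.Tensor:
    """Hypothetical sketch of a delayed KD loss over CTC posteriors.

    student_log_probs: (T, V) log posteriors from the streaming student.
    teacher_log_probs: (T, V) log posteriors from the non-streaming teacher.
    max_delay: TAB size in frames, i.e. the relative delay range the student
        is allowed relative to the teacher's emissions.
    """
    T = student_log_probs.size(0)
    losses = []
    # Try every relative delay d in [0, max_delay]: the student may emit a
    # token up to d frames later than the teacher.
    for d in range(max_delay + 1):
        s = student_log_probs[d:T]       # student frames, shifted by d
        t = teacher_log_probs[:T - d]    # teacher frames, d frames earlier
        kl = F.kl_div(s, t, log_target=True, reduction="batchmean")
        losses.append(kl)
    # Keep the best-aligned delay; this is one way the buffer could "align
    # CTC outputs and mitigate non-blank token mismatches".
    return torch.stack(losses).min()

In the paper, the allowed delay is what gives fine-grained control over token emission latency; whether the alignment is chosen globally per utterance or per non-blank token is not specified in the abstract, so the global minimum above should be read only as an illustration.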

@article{li2025_2505.22069,
  title={Delayed-KD: Delayed Knowledge Distillation based CTC for Low-Latency Streaming ASR},
  author={Longhao Li and Yangze Li and Hongfei Xue and Jie Liu and Shuai Fang and Kai Wang and Lei Xie},
  journal={arXiv preprint arXiv:2505.22069},
  year={2025}
}