Intermediate Loss Regularization for CTC-based Speech Recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
5 February 2021
Jaesong Lee
Shinji Watanabe

Papers citing "Intermediate Loss Regularization for CTC-based Speech Recognition"

50 / 92 papers shown
End-to-end Speech Recognition with similar length speech and text
Peng Fan
Wenping Wang
Fei Deng
146
0
0
12 Oct 2025
UMA-Split: unimodal aggregation for both English and Mandarin non-autoregressive speech recognition
Ying Fang
Xiaofei Li
199
0
0
18 Sep 2025
Geolocation-Aware Robust Spoken Language Identification
Qingzheng Wang
Hye-jin Shim
Jiancheng Sun
Shinji Watanabe
181
1
0
23 Aug 2025
Objective Soups: Multilingual Multi-Task Modeling for Speech Processing
A. F. M. Saif
Lisha Chen
Xiaodong Cui
Songtao Lu
Brian Kingsbury
Tianyi Chen
149
0
0
12 Aug 2025
WCTC-Biasing: Retraining-free Contextual Biasing ASR with Wildcard CTC-based Keyword Spotting and Inter-layer Biasing
Yu Nakagome
Michael Hentschel
283
0
0
02 Jun 2025
Improving Multilingual Speech Models on ML-SUPERB 2.0: Fine-tuning with Data Augmentation and LID-Aware CTC
Qingzheng Wang
Jiancheng Sun
Yifan Peng
Shinji Watanabe
374
1
0
30 May 2025
Complexity boosted adaptive training for better low resource ASR performance
Hongxuan Lu
Shenjian Wang
Biao Li
301
0
0
01 Dec 2024
Enhancing Code-Switching ASR Leveraging Non-Peaky CTC Loss and Deep Language Posterior Injection
Spoken Language Technology Workshop (SLT), 2024
Tzu-Ting Yang
Hsin-Wei Wang
Yi-Cheng Wang
Berlin Chen
395
0
0
26 Nov 2024
Transducer Consistency Regularization for Speech to Text Applications
Spoken Language Technology Workshop (SLT), 2024
Cindy Tseng
Yun Tang
Vijendra Raj Apsingekar
361
0
0
09 Oct 2024
DCIM-AVSR : Efficient Audio-Visual Speech Recognition via Dual Conformer Interaction Module
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Xinyu Wang
Qian Wang
Haolin Huang
Yu Fang
Mengjie Xu
Qian Wang
521
2
0
31 Aug 2024
Robust ASR Error Correction with Conservative Data Filtering
Takuma Udagawa
Masayuki Suzuki
Masayasu Muraoka
Gakuto Kurata
386
5
0
18 Jul 2024
Dynamic Encoder Size Based on Data-Driven Layer-wise Pruning for Speech Recognition
Jingjing Xu
Wei Zhou
Zijian Yang
Eugen Beck
Ralf Schlueter
377
6
0
10 Jul 2024
Tailored Design of Audio-Visual Speech Recognition Models using Branchformers
David Gimeno-Gómez
Carlos David Martínez Hinarejos
520
6
0
09 Jul 2024
Contextualized End-to-end Automatic Speech Recognition with Intermediate Biasing Loss
Muhammad Shakeel
Yui Sudo
Yifan Peng
Shinji Watanabe
AI4CE
323
11
0
23 Jun 2024
Revisiting Interpolation Augmentation for Speech-to-Text Generation
Chen Xu
Jie Wang
Xiaoqian Liu
Qianqian Dong
Chunliang Zhang
Tong Xiao
Jingbo Zhu
Dapeng Man
Wu Yang
220
1
0
22 Jun 2024
InterBiasing: Boost Unseen Word Recognition through Biasing Intermediate Predictions
Yu Nakagome
Michael Hentschel
252
5
0
21 Jun 2024
Rapid Language Adaptation for Multilingual E2E Speech Recognition Using Encoder Prompting
Interspeech (Interspeech), 2024
Yosuke Kashiwagi
Hayato Futami
E. Tsunoo
Siddhant Arora
Shinji Watanabe
234
5
0
18 Jun 2024
Lightweight Audio Segmentation for Long-form Speech Translation
Interspeech (Interspeech), 2024
Jaesong Lee
Soyoon Kim
Hanbyul Kim
Joon Son Chung
224
2
0
15 Jun 2024
CNVSRC 2023: The First Chinese Continuous Visual Speech Recognition Challenge
Interspeech (Interspeech), 2024
Chen Chen
Zehua Liu
Xiaolou Li
Lantian Li
D. Wang
247
6
0
14 Jun 2024
Guiding Frame-Level CTC Alignments Using Self-knowledge Distillation
Eungbeom Kim
Hantae Kim
Kyogu Lee
247
3
0
12 Jun 2024
Enhancing CTC-based speech recognition with diverse modeling units
Shiyi Han
Zhihong Lei
Mingbin Xu
Xingyu Na
Zhen Huang
425
1
0
05 Jun 2024
Low-resource speech recognition and dialect identification of Irish in a multi-task framework
The Speaker and Language Recognition Workshop (Odyssey), 2024
Liam Lonergan
Mengjie Qian
Neasa Ní Chiaráin
Christer Gobl
A. N. Chasaide
293
6
0
02 May 2024
Multilingual Audio-Visual Speech Recognition with Hybrid CTC/RNN-T Fast Conformer
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Maxime Burchi
Krishna C. Puvvada
Jagadeesh Balam
Boris Ginsburg
Radu Timofte
281
19
0
14 Mar 2024
The evaluation of a code-switched Sepedi-English automatic speech recognition system
Amanda Phaladi
T. Modipa
209
0
0
11 Mar 2024
A Study of Dropout-Induced Modality Bias on Robustness to Missing Video Frames for Audio-Visual Speech Recognition
Yusheng Dai
Hang Chen
Jun Du
Ruoyu Wang
Shihao Chen
Jie Ma
Haotian Wang
Chin-Hui Lee
324
13
0
07 Mar 2024
Alternating Weak Triphone/BPE Alignment Supervision from Hybrid Model Improves End-to-End ASR
Jintao Jiang
Yingbo Gao
Mohammad Zeineldeen
Zoltán Tüske
327
0
0
23 Feb 2024
OWSM-CTC: An Open Encoder-Only Speech Foundation Model for Speech Recognition, Translation, and Language Identification
Yifan Peng
Yui Sudo
Muhammad Shakeel
Shinji Watanabe
VLM
412
41
0
20 Feb 2024
Keep Decoding Parallel with Effective Knowledge Distillation from Language Models to End-to-end Speech Recognisers
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Michael Hentschel
Yuta Nishikawa
Tatsuya Komatsu
Yusuke Fujita
351
5
0
22 Jan 2024
MLCA-AVSR: Multi-Layer Cross Attention Fusion based Audio-Visual Speech Recognition
He Wang
Pengcheng Guo
Pan Zhou
Lei Xie
455
25
0
07 Jan 2024
Soft Alignment of Modality Space for End-to-end Speech Translation
Yuhao Zhang
Kaiqi Kou
Bei Li
Chen Xu
Chunliang Zhang
Tong Xiao
Jingbo Zhu
401
9
0
18 Dec 2023
FastInject: Injecting Unpaired Text Data into CTC-based ASR training
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Keqi Deng
Phil Woodland
283
3
0
14 Dec 2023
The GUA-Speech System Description for CNVSRC Challenge 2023
Shengqiang Li
Chao Lei
Baozhong Ma
Binbin Zhang
Fuping Pan
220
1
0
12 Dec 2023
Weak Alignment Supervision from Hybrid Model Improves End-to-end ASR
Jintao Jiang
Yingbo Gao
Zoltán Tüske
441
1
0
24 Nov 2023
Retrieve and Copy: Scaling ASR Personalization to Large Catalogs
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Sai Muralidhar Jayanthi
Devang Kulshreshtha
Saket Dingliwal
S. Ronanki
S. Bodapati
262
9
0
14 Nov 2023
Key Frame Mechanism For Efficient Conformer Based End-to-end Speech Recognition
IEEE Signal Processing Letters (IEEE SPL), 2023
Peng Fan
Changhao Shan
Sining Sun
Qing Yang
Jianwei Zhang
297
4
0
23 Oct 2023
SA-Paraformer: Non-autoregressive End-to-End Speaker-Attributed ASR
Automatic Speech Recognition & Understanding (ASRU), 2023
Yangze Li
Fan Yu
Yuhao Liang
Pengcheng Guo
Mohan Shi
Zhihao Du
Shiliang Zhang
Lei Xie
235
6
0
07 Oct 2023
SSHR: Leveraging Self-supervised Hierarchical Representations for Multilingual Automatic Speech Recognition
IEEE International Conference on Multimedia and Expo (ICME), 2023
Hongfei Xue
Qijie Shao
Tommy Yuan
Peikun Chen
Jie Liu
Lei Xie
316
6
0
29 Sep 2023
Unsupervised Pre-Training for Vietnamese Automatic Speech Recognition in the HYKIST Project
Khai-Nguyen Nguyen
263
2
0
26 Sep 2023
Bridging the Gaps of Both Modality and Language: Synchronous Bilingual CTC for Speech Translation and Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Chen Xu
Xiaoqian Liu
Erfeng He
Yuhao Zhang
Qianqian Dong
Tong Xiao
Jingbo Zhu
Dapeng Man
Wu Yang
228
1
0
21 Sep 2023
Semi-Autoregressive Streaming ASR With Label Context
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Siddhant Arora
G. Saon
Shinji Watanabe
Brian Kingsbury
AI4TS
321
12
0
19 Sep 2023
Unimodal Aggregation for CTC-based Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Ying Fang
Xiaofei Li
295
4
0
15 Sep 2023
Improving CTC-AED model with integrated-CTC and auxiliary loss regularization
Daobin Zhu
Xiangdong Su
Hongbin Zhang
269
2
0
15 Aug 2023
Confidence-based Ensembles of End-to-End Speech Recognition Models
Interspeech (Interspeech), 2023
Igor Gitman
Vitaly Lavrukhin
A. Laptev
Boris Ginsburg
UQCV
435
9
0
27 Jun 2023
Research on an improved Conformer end-to-end Speech Recognition Model with R-Drop Structure
Weidong Ji
Shijie Zan
Guohui Zhou
Xu Wang
SyDa
235
1
0
14 Jun 2023
A New Benchmark of Aphasia Speech Recognition and Detection Based on E-Branchformer and Multi-task Learning
Interspeech (Interspeech), 2023
Jiyang Tang
William Chen
Xuankai Chang
Shinji Watanabe
B. MacWhinney
205
14
0
19 May 2023
A Comparative Study on E-Branchformer vs Conformer in Speech Recognition, Translation, and Understanding Tasks
Interspeech (Interspeech), 2023
Yifan Peng
Kwangyoun Kim
Felix Wu
Brian Yan
Siddhant Arora
William Chen
Jiyang Tang
Suwon Shon
Prashant Sridhar
Shinji Watanabe
289
23
0
18 May 2023
Mask The Bias: Improving Domain-Adaptive Generalization of CTC-based ASR with Internal Language Model Estimation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Nilaksh Das
Monica Sunkara
S. Bodapati
Jason (Jinglun) Cai
Devang Kulshreshtha
Jeffrey J. Farris
Katrin Kirchhoff
218
4
0
05 May 2023
Joint Modelling of Spoken Language Understanding Tasks with Integrated Dialog History
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Siddhant Arora
Hayato Futami
E. Tsunoo
Brian Yan
Shinji Watanabe
337
4
0
01 May 2023
Non-autoregressive End-to-end Approaches for Joint Automatic Speech Recognition and Spoken Language Understanding
Spoken Language Technology Workshop (SLT), 2023
Mohan Li
R. Doddipatla
258
8
0
21 Apr 2023
A CTC Alignment-based Non-autoregressive Transformer for End-to-end Automatic Speech Recognition
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Ruchao Fan
Wei Chu
Peng Chang
Abeer Alwan
256
20
0
15 Apr 2023