Minimum Latency Training Strategies for Streaming Sequence-to-Sequence ASR

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
10 April 2020
Hirofumi Inaguma
Yashesh Gaur
Liang Lu
Jinyu Li
Jiawei Liu
    AI4TS

Papers citing "Minimum Latency Training Strategies for Streaming Sequence-to-Sequence ASR"

33 / 33 papers shown
Streaming Sequence Transduction through Dynamic Compression
Weiting Tan
Yunmo Chen
Tongfei Chen
Guanghui Qin
Haoran Xu
Heidi C. Zhang
Benjamin Van Durme
Philipp Koehn
620
2
0
02 Feb 2024
Unified Segment-to-Segment Framework for Simultaneous Sequence Generation
Neural Information Processing Systems (NeurIPS), 2023
Shaolei Zhang
Yang Feng
344
9
0
27 Oct 2023
CIF-T: A Novel CIF-based Transducer Architecture for Automatic Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Tian-Hao Zhang
Dinghao Zhou
Guiping Zhong
Jiaming Zhou
Baoxiang Li
324
7
0
26 Jul 2023
Globally Normalising the Transducer for Streaming Speech Recognition
Rogier van Dalen
203
0
0
20 Jul 2023
Self-regularised Minimum Latency Training for Streaming Transformer-based Speech Recognition
Interspeech (Interspeech), 2022
Mohan Li
R. Doddipatla
Catalin Zorila
342
0
0
24 Apr 2023
Minimum Latency Training of Sequence Transducers for Streaming End-to-End Speech Recognition
Interspeech (Interspeech), 2022
Yusuke Shinohara
Shinji Watanabe
AI4TS
275
11
0
04 Nov 2022
Delay-penalized transducer for low-latency streaming ASR
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Wei Kang
Zengwei Yao
Fangjun Kuang
Liyong Guo
Xiaoyu Yang
Long Lin
Piotr Żelasko
Daniel Povey
321
13
0
31 Oct 2022
Large-Scale Streaming End-to-End Speech Translation with Neural Transducers
Interspeech (Interspeech), 2022
Jian Xue
Peidong Wang
Jinyu Li
Matt Post
Yashesh Gaur
AI4TS
337
36
0
11 Apr 2022
Dynamic Latency for CTC-Based Streaming Automatic Speech Recognition With Emformer
J. Sun
Guiping Zhong
Dinghao Zhou
Baoxiang Li
234
0
0
29 Mar 2022
Transformer-based Streaming ASR with Cumulative Attention
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Mohan Li
Shucong Zhang
Catalin Zorila
R. Doddipatla
274
12
0
11 Mar 2022
Run-and-back stitch search: novel block synchronous decoding for streaming encoder-decoder ASR
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
E. Tsunoo
Chaitanya Narisetty
Michael Hentschel
Yosuke Kashiwagi
Shinji Watanabe
209
3
0
25 Jan 2022
Building a great multi-lingual teacher with sparsely-gated mixture of experts for speech recognition
K. Kumatani
R. Gmyr
Andres Felipe Cruz Salinas
Linquan Liu
Wei Zuo
Devang Patel
Eric Sun
Yu Shi
MoE
369
22
0
10 Dec 2021
Recent Advances in End-to-End Automatic Speech Recognition
APSIPA Transactions on Signal and Information Processing (TASIP), 2021
Jinyu Li
VLM
548
444
0
02 Nov 2021
An Investigation of Enhancing CTC Model for Triggered Attention-based Streaming ASR
Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2021
Huaibo Zhao
Yosuke Higuchi
Tetsuji Ogawa
Tetsunori Kobayashi
162
4
0
20 Oct 2021
VAD-free Streaming Hybrid CTC/Attention ASR for Unsegmented Recording
Interspeech (Interspeech), 2021
Hirofumi Inaguma
Tatsuya Kawahara
311
2
0
15 Jul 2021
StableEmit: Selection Probability Discount for Reducing Emission Latency of Streaming Monotonic Attention ASR
Hirofumi Inaguma
Tatsuya Kawahara
218
4
0
01 Jul 2021
Reducing Streaming ASR Model Delay with Self Alignment
Interspeech (Interspeech), 2021
Jaeyoung Kim
Han Lu
Anshuman Tripathi
Qian Zhang
Hasim Sak
166
21
0
06 May 2021
Dissecting User-Perceived Latency of On-Device E2E Speech Recognition
Interspeech (Interspeech), 2021
Yuan Shangguan
Rohit Prabhavalkar
Hang Su
Jay Mahadeokar
Yangyang Shi
...
Chunyang Wu
Duc Le
Ozlem Kalinli
Christian Fuegen
M. Seltzer
298
34
0
06 Apr 2021
Mutually-Constrained Monotonic Multihead Attention for Online ASR
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Jae-gyun Song
Hajin Shim
Eunho Yang
136
0
0
26 Mar 2021
Alignment Knowledge Distillation for Online Streaming Attention-based Speech Recognition
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2021
Hirofumi Inaguma
Tatsuya Kawahara
414
19
0
28 Feb 2021
Thank you for Attention: A survey on Attention-based Artificial Neural Networks for Automatic Speech Recognition
Intelligent Systems with Applications (ISA), 2021
Priyabrata Karmakar
S. Teng
Guojun Lu
184
37
0
14 Feb 2021
A Better and Faster End-to-End Model for Streaming ASR
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
Yue Liu
Anmol Gulati
Jiahui Yu
Tara N. Sainath
Chung-Cheng Chiu
...
Wei Han
Qiao Liang
Yu Zhang
Trevor Strohman
Yonghui Wu
AuLLM
444
133
0
21 Nov 2020
Benchmarking LF-MMI, CTC and RNN-T Criteria for Streaming ASR
Xiaohui Zhang
Frank Zhang
Chunxi Liu
Kjell Schubert
Julian Chan
...
Jun Liu
Ching-Feng Yeh
Fuchun Peng
Yatharth Saraf
Geoffrey Zweig
187
20
0
09 Nov 2020
Improving RNN Transducer Based ASR with Auxiliary Tasks
Chunxi Liu
Frank Zhang
Duc Le
Suyoun Kim
Yatharth Saraf
Geoffrey Zweig
364
49
0
05 Nov 2020
Fluent and Low-latency Simultaneous Speech-to-Speech Translation with Self-adaptive Training
Findings (Findings), 2020
Renjie Zheng
Mingbo Ma
Baigong Zheng
Kaibo Liu
Jiahong Yuan
Kenneth Church
Liang Huang
251
16
0
20 Oct 2020
Parallel Rescoring with Transformer for Streaming On-Device Speech Recognition
Interspeech (Interspeech), 2020
Wei Li
James Qin
Chung-Cheng Chiu
Ruoming Pang
Yanzhang He
233
15
0
30 Aug 2020
Large-scale Transfer Learning for Low-resource Spoken Language Understanding
Interspeech (Interspeech), 2020
X. Jia
Jianzong Wang
Zhiyong Zhang
Ning Cheng
Jing Xiao
216
17
0
13 Aug 2020
Online Automatic Speech Recognition with Listen, Attend and Spell Model
IEEE Signal Processing Letters (IEEE SPL), 2020
Roger Hsiao
Dogan Can
Tim Ng
R. Travadi
Arnab Ghoshal
RALM
168
19
0
12 Aug 2020
Streaming Transformer ASR with Blockwise Synchronous Beam Search
E. Tsunoo
Yosuke Kashiwagi
Shinji Watanabe
401
11
0
25 Jun 2020
On the Comparison of Popular End-to-End Models for Large Scale Speech Recognition
Interspeech (Interspeech), 2020
Jinyu Li
Yu-Huan Wu
Yashesh Gaur
Chengyi Wang
Rui Zhao
Shujie Liu
375
142
0
28 May 2020
Low-Latency Sequence-to-Sequence Speech Recognition and Translation by Partial Hypothesis Selection
Danni Liu
Gerasimos Spanakis
Jan Niehues
201
65
0
22 May 2020
Enhancing Monotonic Multihead Attention for Streaming ASR
Hirofumi Inaguma
Masato Mimura
Tatsuya Kawahara
430
37
0
19 May 2020
CTC-synchronous Training for Monotonic Attention Model
Hirofumi Inaguma
Masato Mimura
Tatsuya Kawahara
229
7
0
10 May 2020