Unified Streaming and Non-streaming Two-pass End-to-end Model for Speech Recognition

10 December 2020
Binbin Zhang
Di Wu
Zhuoyuan Yao
Xiong Wang
F. Yu
Chao Yang
Liyong Guo
Yaguang Hu
Lei Xie
X. Lei

Papers citing "Unified Streaming and Non-streaming Two-pass End-to-end Model for Speech Recognition"

45 / 45 papers shown
Fewer Hallucinations, More Verification: A Three-Stage LLM-Based Framework for ASR Error Correction
Yangui Fang
Baixu Cheng
Jing Peng
Xu Li
Yu Xi
Chengwei Zhang
Guohui Zhong
370
7
0
24 Dec 2025
Chunk Based Speech Pre-training with High Resolution Finite Scalar Quantization
Yun Tang
Cindy Tseng
132
0
0
19 Sep 2025
In-domain SSL pre-training and streaming ASR
J. Duret
Salima Mdhaffar
G. Laperriere
Ryan Whetten
Audrey Galametz
Catherine Kobus
Marion-Cécile Martin
Jo Oleiwan
Yannick Esteve
135
1
0
15 Sep 2025
Adapting Whisper for Streaming Speech Recognition via Two-Pass Decoding
Haoran Zhou
Xingchen Song
Brendan Fahy
Qiaochu Song
Binbin Zhang
...
Denglin Jiang
Apurv Verma
Vinay Ramesh
Srivas Prasad
Michele M. Franceschini
209
1
0
13 Jun 2025
Mamba for Streaming ASR Combined with Unimodal Aggregation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Ying Fang
Xiaofei Li
Mamba
265
11
0
30 Sep 2024
CUSIDE-T: Chunking, Simulating Future and Decoding for Transducer based Streaming ASR
Wenbo Zhao
Ziwei Li
Chuan Yu
Zhijian Ou
AI4TS
281
4
0
14 Jul 2024
A framework of text-dependent speaker verification for chinese numerical string corpus
Litong Zheng
Feng Hong
Weijie Xu
Wan Zheng
334
1
0
11 May 2024
Skipformer: A Skip-and-Recover Strategy for Efficient Speech Recognition
IEEE International Conference on Multimedia and Expo (ICME), 2024
Wenjing Zhu
Sining Sun
Changhao Shan
Peng Fan
Qing Yang
363
4
0
13 Mar 2024
R-BI: Regularized Batched Inputs enhance Incremental Decoding Framework for Low-Latency Simultaneous Speech Translation
Jiaxin Guo
Zhanglin Wu
Zongyao Li
Hengchao Shang
Daimeng Wei
Xiaoyu Chen
Zhiqiang Rao
Shaojun Li
Hao Yang
207
1
0
11 Jan 2024
UCorrect: An Unsupervised Framework for Automatic Speech Recognition Error Correction
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Jiaxin Guo
Minghan Wang
Xiaosong Qiao
Daimeng Wei
Hengchao Shang
...
Yinglu Li
Yan Yu
Min Zhang
Shimin Tao
Hao Yang
206
6
0
11 Jan 2024
U2-KWS: Unified Two-pass Open-vocabulary Keyword Spotting with Keyword Bias
Automatic Speech Recognition & Understanding (ASRU), 2023
Aoting Zhang
Pan Zhou
Kaixun Huang
Yong Zou
Ming Liu
Lei Xie
233
8
0
15 Dec 2023
CDSD: Chinese Dysarthria Speech Database
Interspeech, 2023
Mengyi Sun
Ming Gao
Xinchen Kang
Shiru Wang
Jun Du
Dengfeng Yao
Su-Jing Wang
421
9
0
24 Oct 2023
Chunked Attention-based Encoder-Decoder Model for Streaming Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Mohammad Zeineldeen
Albert Zeyer
Ralf Schluter
Hermann Ney
AuLLM
357
13
0
15 Sep 2023
Improving CTC-AED model with integrated-CTC and auxiliary loss regularization
Daobin Zhu
Xiangdong Su
Hongbin Zhang
258
2
0
15 Aug 2023
ApproBiVT: Lead ASR Models to Generalize Better Using Approximated Bias-Variance Tradeoff Guided Early Stopping and Checkpoint Averaging
Fangyuan Wang
Ming Hao
Yuhai Shi
Bo Xu
MoMe
197
0
0
05 Aug 2023
Exploring the Integration of Large Language Models into Automatic Speech Recognition Systems: An Empirical Study
International Conference on Neural Information Processing (ICONIP), 2023
Zeping Min
Jinbo Wang
AuLLM
218
20
0
13 Jul 2023
Hyper-parameter Adaptation of Conformer ASR Systems for Elderly and Dysarthric Speech Recognition
Interspeech, 2023
Tianzi Wang
Shoukang Hu
Jiajun Deng
Zengrui Jin
Mengzhe Geng
Yi Wang
Helen M. Meng
Xunying Liu
270
11
0
27 Jun 2023
DCTX-Conformer: Dynamic context carry-over for low latency unified streaming and non-streaming Conformer ASR
Interspeech, 2023
Goeric Huybrechts
S. Ronanki
Xilai Li
H. Nosrati
S. Bodapati
Katrin Kirchhoff
207
2
0
13 Jun 2023
Enhancing the Unified Streaming and Non-streaming Model with Contrastive Learning
Interspeech, 2023
Yuting Yang
Yuke Li
Binbin Du
AI4TS
224
1
0
01 Jun 2023
Perception and Semantic Aware Regularization for Sequential Confidence Calibration
Computer Vision and Pattern Recognition (CVPR), 2023
Zhenghua Peng
Yuanmao Luo
Tianshui Chen
Keke Xu
Shuangping Huang
AI4TS
316
4
0
31 May 2023
DualVC: Dual-mode Voice Conversion using Intra-model Knowledge Distillation and Hybrid Predictive Coding
Interspeech, 2023
Ziqian Ning
Yuepeng Jiang
Pengcheng Zhu
Jixun Yao
Shuai Wang
Linfu Xie
Mengxiao Bi
201
14
0
21 May 2023
ZeroPrompt: Streaming Acoustic Encoders are Zero-Shot Masked LMs
Interspeech, 2023
Xingcheng Song
Di Wu
Binbin Zhang
Zhendong Peng
Bo Dang
Fuping Pan
Zhiyong Wu
203
21
0
18 May 2023
Dynamic Chunk Convolution for Unified Streaming and Non-Streaming Conformer ASR
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Xilai Li
Goeric Huybrechts
S. Ronanki
Jeffrey J. Farris
S. Bodapati
196
15
0
18 Apr 2023
Emphasizing Unseen Words: New Vocabulary Acquisition for End-to-End Speech Recognition
Neural Networks (Neural Netw.), 2023
Leyuan Qu
C. Weber
S. Wermter
176
13
0
20 Feb 2023
E2E Segmentation in a Two-Pass Cascaded Encoder ASR Model
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Wenjie Huang
Shuo-yiin Chang
Tara N. Sainath
Yanzhang He
David Rybach
R. David
Rohit Prabhavalkar
Cyril Allauzen
Cal Peyser
Trevor Strohman
273
6
0
28 Nov 2022
SSCFormer: Push the Limit of Chunk-wise Conformer for Streaming ASR Using Sequentially Sampled Chunks and Chunked Causal Convolution
IEEE Signal Processing Letters (SPL), 2022
Fangyuan Wang
Bo Xu
Bo Xu
342
0
0
21 Nov 2022
Fast-U2++: Fast and Accurate End-to-End Speech Recognition in Joint CTC/Attention Frames
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Che-Yuan Liang
Xiao-Lei Zhang
BinBin Zhang
Di Wu
Shengqiang Li
Xingcheng Song
Zhendong Peng
Fuping Pan
161
11
0
02 Nov 2022
Delay-penalized transducer for low-latency streaming ASR
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Wei Kang
Zengwei Yao
Fangjun Kuang
Liyong Guo
Xiaoyu Yang
Long Lin
Piotr Żelasko
Daniel Povey
304
12
0
31 Oct 2022
Predicting Multi-Codebook Vector Quantization Indexes for Knowledge Distillation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Liyong Guo
Xiaoyu Yang
Quandong Wang
Yuxiang Kong
Zengwei Yao
...
Wei Kang
Long Lin
Mingshuang Luo
Piotr Żelasko
Daniel Povey
VLM
254
10
0
31 Oct 2022
Streaming Voice Conversion Via Intermediate Bottleneck Features And Non-streaming Teacher Guidance
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Yuan-Jui Chen
Ming Tu
Tang-Chun Li
Xin Li
Qiuqiang Kong
Jiaxin Li
Zhichao Wang
Qiao Tian
Yuping Wang
Yuxuan Wang
256
17
0
27 Oct 2022
Linguistic-Enhanced Transformer with CTC Embedding for Speech Recognition
International Conference on Mobile Ad-hoc and Sensor Networks (MSN), 2022
Xulong Zhang
Jianzong Wang
Ning Cheng
Mengyuan Zhao
Zhiyong Zhang
Jing Xiao
130
1
0
25 Oct 2022
Improving Mandarin Speech Recogntion with Block-augmented Transformer
Xiaoming Ren
Huifeng Zhu
Liuwei Wei
Minghui Wu
Jie Hao
268
12
0
24 Jul 2022
Improving Streaming End-to-End ASR on Transformer-based Causal Models with Encoder States Revision Strategies
Interspeech, 2022
Zehan Li
Haoran Miao
Keqi Deng
Gaofeng Cheng
Sanli Tian
Ta Li
Yonghong Yan
KELM
246
6
0
06 Jul 2022
Language-specific Characteristic Assistance for Code-switching Speech Recognition
Interspeech, 2022
Tongtong Song
Qiang Xu
Meng Ge
Longbiao Wang
Hao Shi
Yongjie Lv
Yuqin Lin
Jianwu Dang
229
36
0
29 Jun 2022
Streaming non-autoregressive model for any-to-many voice conversion
Ziyi Chen
Haoran Miao
Pengyuan Zhang
159
9
0
15 Jun 2022
PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit
North American Chapter of the Association for Computational Linguistics (NAACL), 2022
Hui Zhang
Tian Yuan
Junkun Chen
Xintong Li
Renjie Zheng
...
Zeyu Chen
Xiaoguang Hu
Dianhai Yu
Yanjun Ma
Liang Huang
AuLLM
159
32
0
20 May 2022
Shifted Chunk Encoder for Transformer Based Streaming End-to-End ASR
International Conference on Neural Information Processing (ICONIP), 2022
Fangyuan Wang
Bo Xu
216
5
0
29 Mar 2022
ASCEND: A Spontaneous Chinese-English Dataset for Code-switching in Multi-turn Conversation
Holy Lovenia
Samuel Cahyawijaya
Genta Indra Winata
Peng Xu
Xu Yan
...
Elham J. Barezi
Qifeng Chen
Xiaojuan Ma
Bertram E. Shi
Pascale Fung
441
45
0
12 Dec 2021
Fast-MD: Fast Multi-Decoder End-to-End Speech Translation with Non-Autoregressive Hidden Intermediates
Automatic Speech Recognition & Understanding (ASRU), 2021
Hirofumi Inaguma
Siddharth Dalmia
Brian Yan
Shinji Watanabe
266
12
0
27 Sep 2021
Wav-BERT: Cooperative Acoustic and Linguistic Representation Learning for Low-Resource Speech Recognition
Guolin Zheng
Yubei Xiao
Ke Gong
Pan Zhou
Xiaodan Liang
Liang Lin
227
28
0
19 Sep 2021
SimulLR: Simultaneous Lip Reading Transducer with Attention-Guided Adaptive Memory
ACM Multimedia (ACM MM), 2021
Zhijie Lin
Zhou Zhao
Haoyuan Li
Jinglin Liu
Meng Zhang
Xingshan Zeng
Xiaofei He
171
19
0
31 Aug 2021
Decoupling recognition and transcription in Mandarin ASR
Jiahong Yuan
Xingyu Cai
Dongji Gao
Renjie Zheng
Liang Huang
Kenneth Church
209
13
0
02 Aug 2021
U2++: Unified Two-pass Bidirectional End-to-end Model for Speech Recognition
Di Wu
Binbin Zhang
Chao Yang
Zhendong Peng
Wenjing Xia
Xiaoyu Chen
X. Lei
323
58
0
10 Jun 2021
WNARS: WFST based Non-autoregressive Streaming End-to-End Speech Recognition
Zhichao Wang
Wenwen Yang
Pan Zhou
Wei Chen
RALM
187
18
0
08 Apr 2021
WeNet: Production oriented Streaming and Non-streaming End-to-End Speech Recognition Toolkit
Interspeech, 2021
Zhuoyuan Yao
Di Wu
Xiong Wang
Binbin Zhang
Fan Yu
Chao Yang
Zhendong Peng
Xiaoyu Chen
Lei Xie
X. Lei
444
313
0
02 Feb 2021