Transformer-based Online CTC/attention End-to-End Speech Recognition Architecture

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020

15 January 2020

Pengyuan Zhang

Papers citing "Transformer-based Online CTC/attention End-to-End Speech Recognition Architecture"

33 / 33 papers shown

Spiralformer: Low Latency Encoder for Streaming Speech Recognition with Circular Layer Skipping and Early Exiting

134

01 Oct 2025

EgoSpeak: Learning When to Speak for Egocentric Conversational Agents in the Wild

North American Chapter of the Association for Computational Linguistics (NAACL), 2025

203

17 Feb 2025

Research on an improved Conformer end-to-end Speech Recognition Model with R-Drop Structure

226

14 Jun 2023

Streaming Speech-to-Confusion Network Speech Recognition

Interspeech (Interspeech), 2023

283

02 Jun 2023

Improved Training for End-to-End Streaming Automatic Speech Recognition Model with Punctuation

Interspeech (Interspeech), 2023

142

02 Jun 2023

HYBRIDFORMER: improving SqueezeFormer with hybrid attention and NSR mechanism

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Jiangyu Han

Heng Lu

169

15 Mar 2023

Language-Universal Adapter Learning with Knowledge Distillation for End-to-End Multilingual Speech Recognition

Zhijie Shen

Wu Guo

Bin Gu

348

28 Feb 2023

UFO2: A unified pre-training framework for online and offline speech recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

360

26 Oct 2022

Attention Enhanced Citrinet for Speech Recognition

Interspeech (Interspeech), 2022

Xianchao Wu

202

01 Sep 2022

Deep Sparse Conformer for Speech Recognition

Interspeech (Interspeech), 2022

Xianchao Wu

134

01 Sep 2022

Intermediate-layer output Regularization for Attention-based Speech Recognition with Shared Decoder

282

09 Jul 2022

Improving Streaming End-to-End ASR on Transformer-based Causal Models with Encoder States Revision Strategies

Interspeech (Interspeech), 2022

295

06 Jul 2022

Boosting Cross-Domain Speech Recognition with Self-Supervision

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022

Pengyuan Zhang

407

20 Jun 2022

CUSIDE: Chunking, Simulating Future Context and Decoding for Streaming ASR

Interspeech (Interspeech), 2022

Zhijian Ou

233

31 Mar 2022

WeNet 2.0: More Productive End-to-End Speech Recognition Toolkit

Interspeech (Interspeech), 2022

Binbin Zhang

Chao Yang

352

133

29 Mar 2022

Run-and-back stitch search: novel block synchronous decoding for streaming encoder-decoder ASR

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

205

25 Jan 2022

Recent Advances in End-to-End Automatic Speech Recognition

APSIPA Transactions on Signal and Information Processing (TASIP), 2021

Jinyu Li

VLM

548

444

02 Nov 2021

A Melody-Unsupervision Model for Singing Voice Synthesis

Soonbeom Choi

Juhan Nam

188

13 Oct 2021

Streaming End-to-End ASR based on Blockwise Non-Autoregressive Models

Interspeech (Interspeech), 2021

Tianzi Wang

Yuya Fujita

Xuankai Chang

Shinji Watanabe

273

20 Jul 2021

Deformable TDNN with adaptive receptive fields for speech recognition

Interspeech (Interspeech), 2021

Keyu An

Yi Zhang

Zhijian Ou

122

30 Apr 2021

WNARS: WFST based Non-autoregressive Streaming End-to-End Speech Recognition

213

08 Apr 2021

Mutually-Constrained Monotonic Multihead Attention for Online ASR

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021

Jae-gyun Song

Hajin Shim

Eunho Yang

136

26 Mar 2021

Parallelizing Legendre Memory Unit Training

International Conference on Machine Learning (ICML), 2021

Narsimha Chilkuri

C. Eliasmith

295

22 Feb 2021

Thank you for Attention: A survey on Attention-based Artificial Neural Networks for Automatic Speech Recognition

Intelligent Systems with Applications (ISA), 2021

Priyabrata Karmakar

S. Teng

Guojun Lu

184

14 Feb 2021

WeNet: Production oriented Streaming and Non-streaming End-to-End Speech Recognition Toolkit

Interspeech (Interspeech), 2021

Binbin Zhang

Chao Yang

Lei Xie

492

316

02 Feb 2021

Fast offline Transformer-based end-to-end automatic speech recognition for real-world applications

ETRI Journal (ETRI J.), 2021

378

14 Jan 2021

Dual-mode ASR: Unify and Improve Streaming ASR with Full-context Modeling

410

12 Oct 2020

Super-Human Performance in Online Low-latency Recognition of Conversational Speech

425

07 Oct 2020

Large-scale Transfer Learning for Low-resource Spoken Language Understanding

Interspeech (Interspeech), 2020

215

13 Aug 2020

Transformer with Bidirectional Decoder for Speech Recognition

Interspeech (Interspeech), 2020

Dandan Song

190

11 Aug 2020

Low-Latency Sequence-to-Sequence Speech Recognition and Translation by Partial Hypothesis Selection

Danni Liu

Gerasimos Spanakis

Jan Niehues

201

22 May 2020

A Comparison of Label-Synchronous and Frame-Synchronous End-to-End Models for Speech Recognition

Linhao Dong

206

20 May 2020

A Further Study of Unsupervised Pre-training for Transformer Based Speech Recognition

Wei Zou

Xiangang Li

SSL

268

20 May 2020