A Comparison of Label-Synchronous and Frame-Synchronous End-to-End Models for Speech Recognition

20 May 2020

Linhao Dong

Papers citing "A Comparison of Label-Synchronous and Frame-Synchronous End-to-End Models for Speech Recognition"

12 / 12 papers shown

Chunk Based Speech Pre-training with High Resolution Finite Scalar Quantization

Yun Tang

Cindy Tseng

146

19 Sep 2025

Transducer Consistency Regularization for Speech to Text Applications

Spoken Language Technology Workshop (SLT), 2024

Cindy Tseng

Yun Tang

Vijendra Raj Apsingekar

344

09 Oct 2024

Lightweight Transducer Based on Frame-Level Criterion

Interspeech (Interspeech), 2024

351

05 Sep 2024

Incremental Blockwise Beam Search for Simultaneous Speech Translation with Controllable Quality-Latency Tradeoff

Interspeech (Interspeech), 2023

Shinji Watanabe

229

20 Sep 2023

CIF-T: A Novel CIF-based Transducer Architecture for Automatic Speech Recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Jiaming Zhou

321

26 Jul 2023

Integration of Frame- and Label-synchronous Beam Search for Streaming Encoder-decoder Speech Recognition

Interspeech (Interspeech), 2023

Shinji Watanabe

228

24 Jul 2023

Exploring Continuous Integrate-and-Fire for Adaptive Simultaneous Speech Translation

Interspeech (Interspeech), 2022

Chih-Chiang Chang

Hung-yi Lee

288

22 Mar 2022

Improving Non-autoregressive End-to-End Speech Recognition with Pre-trained Acoustic and Language Models

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Pengyuan Zhang

241

25 Jan 2022

CarneliNet: Neural Mixture Model for Automatic Speech Recognition

A. Kalinov

Somshubra Majumdar

Jagadeesh Balam

Boris Ginsburg

MoE

128

22 Jul 2021

VAD-free Streaming Hybrid CTC/Attention ASR for Unsegmented Recording

Interspeech (Interspeech), 2021

Hirofumi Inaguma

Tatsuya Kawahara

271

15 Jul 2021

Efficiently Fusing Pretrained Acoustic and Linguistic Encoders for Low-resource Speech Recognition

IEEE Signal Processing Letters (IEEE SPL), 2021

Cheng Yi

Shiyu Zhou

Bo Xu

242

17 Jan 2021

AV Taris: Online Audio-Visual Speech Recognition

George Sterpu

N. Harte

197

14 Dec 2020