Streaming automatic speech recognition with the transformer model

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020

8 January 2020

Papers citing "Streaming automatic speech recognition with the transformer model"

50 / 115 papers shown

AdaCoach: A Virtual Coach for Training Customer Service Agents

184

27 Apr 2022

Blockwise Streaming Transformer for Spoken Language Understanding and Simultaneous Speech Translation

Interspeech (Interspeech), 2022

Keqi Deng

Shinji Watanabe

Jiatong Shi

Siddhant Arora

189

19 Apr 2022

An Investigation of Monotonic Transducers for Large-Scale Automatic Speech Recognition

Spoken Language Technology Workshop (SLT), 2022

387

19 Apr 2022

Personal VAD 2.0: Optimizing Personal Voice Activity Detection for On-Device Speech Recognition

Interspeech (Interspeech), 2022

210

08 Apr 2022

CUSIDE: Chunking, Simulating Future Context and Decoding for Streaming ASR

Interspeech (Interspeech), 2022

Zhijian Ou

183

31 Mar 2022

Shifted Chunk Encoder for Transformer Based Streaming End-to-End ASR

International Conference on Neural Information Processing (ICONIP), 2022

Fangyuan Wang

Bo Xu

182

29 Mar 2022

Transformer-based Streaming ASR with Cumulative Attention

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Mohan Li

Shucong Zhang

Catalin Zorila

R. Doddipatla

200

11 Mar 2022

Run-and-back stitch search: novel block synchronous decoding for streaming encoder-decoder ASR

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

153

25 Jan 2022

A comparison of streaming models and data augmentation methods for robust speech recognition

Automatic Speech Recognition & Understanding (ASRU), 2021

123

19 Nov 2021

Solving Probability and Statistics Problems by Program Synthesis

137

16 Nov 2021

Recent Advances in End-to-End Automatic Speech Recognition

APSIPA Transactions on Signal and Information Processing (TASIP), 2021

Jinyu Li

434

431

02 Nov 2021

Sequence Transduction with Graph-based Supervision

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021

219

01 Nov 2021

Visualization: the missing factor in Simultaneous Speech Translation

Italian Conference on Computational Linguistics (CLiC-it), 2021

Sara Papi

Matteo Negri

Marco Turchi

216

31 Oct 2021

An Investigation of Enhancing CTC Model for Triggered Attention-based Streaming ASR

Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2021

20 Oct 2021

Study of positional encoding approaches for Audio Spectrogram Transformers

132

13 Oct 2021

SRU++: Pioneering Fast Recurrence with Attention for Speech Recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021

Kwangyoun Kim

115

11 Oct 2021

VideoModerator: A Risk-aware Framework for Multimodal Video Moderation in E-Commerce

IEEE Transactions on Visualization and Computer Graphics (TVCG), 2021

Lingyun Yu

186

08 Sep 2021

Streaming End-to-End ASR based on Blockwise Non-Autoregressive Models

Interspeech (Interspeech), 2021

Tianzi Wang

Yuya Fujita

Xuankai Chang

Shinji Watanabe

238

20 Jul 2021

A Dialogue-based Information Extraction System for Medical Insurance Assessment

139

13 Jul 2021

Variational Information Bottleneck for Effective Low-resource Audio Classification

Interspeech (Interspeech), 2021

Lei Chen

138

10 Jul 2021

Relaxed Attention: A Simple Method to Boost Performance of End-to-End Automatic Speech Recognition

152

02 Jul 2021

Dual Causal/Non-Causal Self-Attention for Streaming End-to-End Speech Recognition

Niko Moritz

Takaaki Hori

Jonathan Le Roux

147

02 Jul 2021

Pay Better Attention to Attention: Head Selection in Multilingual and Multi-Domain Sequence Modeling

Neural Information Processing Systems (NeurIPS), 2021

Hongyu Gong

Yun Tang

J. Pino

Xian Li

213

21 Jun 2021

Direct Simultaneous Speech-to-Text Translation Assisted by Synchronized Streaming ASR

Findings (Findings), 2021

225

11 Jun 2021

Bridging the gap between streaming and non-streaming ASR systems by distilling ensembles of CTC and RNN-T models

Interspeech (Interspeech), 2021

154

25 Apr 2021

Advanced Long-context End-to-end Speech Recognition Using Context-expanded Transformers

Interspeech (Interspeech), 2021

145

19 Apr 2021

TransVG: End-to-End Visual Grounding with Transformers

IEEE International Conference on Computer Vision (ICCV), 2021

646

442

17 Apr 2021

Capturing Multi-Resolution Context by Dilated Self-Attention

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021

Niko Moritz

Takaaki Hori

Jonathan Le Roux

144

07 Apr 2021

Extremely Low Footprint End-to-End ASR System for Smart Device

Interspeech (Interspeech), 2021

118

06 Apr 2021

Mutually-Constrained Monotonic Multihead Attention for Online ASR

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021

Jae-gyun Song

Hajin Shim

Eunho Yang

103

26 Mar 2021

Transformer-based ASR Incorporating Time-reduction Layer and Fine-tuning with Self-Knowledge Distillation

Interspeech (Interspeech), 2021

Md. Akmal Haidar

Chao Xing

Mehdi Rezagholizadeh

193

17 Mar 2021

Fine-tuning of Pre-trained End-to-end Speech Recognition with Generative Adversarial Networks

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021

Md. Akmal Haidar

Mehdi Rezagholizadeh

267

10 Mar 2021

Alignment Knowledge Distillation for Online Streaming Attention-based Speech Recognition

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2021

Hirofumi Inaguma

Tatsuya Kawahara

376

28 Feb 2021

Thank you for Attention: A survey on Attention-based Artificial Neural Networks for Automatic Speech Recognition

Intelligent Systems with Applications (ISA), 2021

Priyabrata Karmakar

S. Teng

Guojun Lu

143

14 Feb 2021

Motion-Based Handwriting Recognition and Word Reconstruction

Junshen Kevin Chen

Wanze Xie

Yutong He

183

15 Jan 2021

Fast offline Transformer-based end-to-end automatic speech recognition for real-world applications

ETRI Journal (ETRI J.), 2021

320

14 Jan 2021

s-Transformer: Segment-Transformer for Robust Neural Speech Synthesis

108

17 Nov 2020

Block-Online Guided Source Separation

Spoken Language Technology Workshop (SLT), 2020

Shota Horiguchi

Yusuke Fujita

Kenji Nagamatsu

150

16 Nov 2020

Dynamic latency speech recognition with asynchronous revision

Yang Zhang

163

03 Nov 2020

Semi-Supervised Speech Recognition via Graph-based Temporal Classification

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020

Niko Moritz

Takaaki Hori

Jonathan Le Roux

287

29 Oct 2020

CASS-NAT: CTC Alignment-based Single Step Non-autoregressive Transformer for Speech Recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020

156

28 Oct 2020

Transformer in action: a comparative study of transformer-based acoustic models for large scale speech recognition applications

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020

290

27 Oct 2020

Universal ASR: Unifying Streaming and Non-Streaming ASR Using a Single Encoder-Decoder Model

130

27 Oct 2020

Transformer-based End-to-End Speech Recognition with Local Dense Synthesizer Attention

Menglong Xu

Shengqiang Li

Xiao-Lei Zhang

261

23 Oct 2020

Improving Streaming Automatic Speech Recognition With Non-Streaming Model Distillation On Unsupervised Data

299

22 Oct 2020

Developing Real-time Streaming Transformer Transducer for Speech Recognition on Large-scale Dataset

Xie Chen

284

200

22 Oct 2020

Emformer: Efficient Memory Transformer Based Acoustic Model For Low Latency Streaming Speech Recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020

798

190

21 Oct 2020

Dual-mode ASR: Unify and Improve Streaming ASR with Full-context Modeling

374

12 Oct 2020

Super-Human Performance in Online Low-latency Recognition of Conversational Speech

340

07 Oct 2020

Large-scale Transfer Learning for Low-resource Spoken Language Understanding

Interspeech (Interspeech), 2020

167

13 Aug 2020