2001.02674
Cited By
v5 (latest)
Streaming automatic speech recognition with the transformer model
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
8 January 2020
Niko Moritz
Takaaki Hori
Jonathan Le Roux
Papers citing
"Streaming automatic speech recognition with the transformer model"
50 / 115 papers shown
AdaCoach: A Virtual Coach for Training Customer Service Agents
Shuang Peng
Shuai Zhu
Minghui Yang
Haozhou Huang
Dan Liu
Zujie Wen
Xuelian Li
Biao Fan
184
0
0
27 Apr 2022
Blockwise Streaming Transformer for Spoken Language Understanding and Simultaneous Speech Translation
Interspeech (Interspeech), 2022
Keqi Deng
Shinji Watanabe
Jiatong Shi
Siddhant Arora
189
15
0
19 Apr 2022
An Investigation of Monotonic Transducers for Large-Scale Automatic Speech Recognition
Spoken Language Technology Workshop (SLT), 2022
Niko Moritz
Frank Seide
Duc Le
Jay Mahadeokar
Christian Fuegen
387
10
0
19 Apr 2022
Personal VAD 2.0: Optimizing Personal Voice Activity Detection for On-Device Speech Recognition
Interspeech (Interspeech), 2022
Shaojin Ding
R. Rikhye
Qiao Liang
Yanzhang He
Quan Wang
A. Narayanan
Tom O'Malley
Ian McGraw
210
39
0
08 Apr 2022
CUSIDE: Chunking, Simulating Future Context and Decoding for Streaming ASR
Interspeech (Interspeech), 2022
Keyu An
Huahuan Zheng
Zhijian Ou
Hongyu Xiang
Ke Ding
Guanglu Wan
AI4TS
183
21
0
31 Mar 2022
Shifted Chunk Encoder for Transformer Based Streaming End-to-End ASR
International Conference on Neural Information Processing (ICONIP), 2022
Fangyuan Wang
Bo Xu
182
5
0
29 Mar 2022
Transformer-based Streaming ASR with Cumulative Attention
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Mohan Li
Shucong Zhang
Catalin Zorila
R. Doddipatla
200
11
0
11 Mar 2022
Run-and-back stitch search: novel block synchronous decoding for streaming encoder-decoder ASR
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
E. Tsunoo
Chaitanya Narisetty
Michael Hentschel
Yosuke Kashiwagi
Shinji Watanabe
153
3
0
25 Jan 2022
A comparison of streaming models and data augmentation methods for robust speech recognition
Automatic Speech Recognition & Understanding (ASRU), 2021
Jiyeon Kim
Mehul Kumar
Dhananjaya N. Gowda
Abhinav Garg
Chanwoo Kim
123
6
0
19 Nov 2021
Solving Probability and Statistics Problems by Program Synthesis
Leonard Tang
Elizabeth Ke
Nikhil Singh
Nakul Verma
Iddo Drori
137
15
0
16 Nov 2021
Recent Advances in End-to-End Automatic Speech Recognition
APSIPA Transactions on Signal and Information Processing (TASIP), 2021
Jinyu Li
VLM
434
431
0
02 Nov 2021
Sequence Transduction with Graph-based Supervision
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Niko Moritz
Takaaki Hori
Shinji Watanabe
Jonathan Le Roux
219
7
0
01 Nov 2021
Visualization: the missing factor in Simultaneous Speech Translation
Italian Conference on Computational Linguistics (CLiC-it), 2021
Sara Papi
Matteo Negri
Marco Turchi
216
2
0
31 Oct 2021
An Investigation of Enhancing CTC Model for Triggered Attention-based Streaming ASR
Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2021
Huaibo Zhao
Yosuke Higuchi
Tetsuji Ogawa
Tetsunori Kobayashi
97
4
0
20 Oct 2021
Study of positional encoding approaches for Audio Spectrogram Transformers
L. Pepino
Pablo Riera
Luciana Ferrer
ViT
132
7
0
13 Oct 2021
SRU++: Pioneering Fast Recurrence with Attention for Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Jing Pan
Tao Lei
Kwangyoun Kim
Kyu Jeong Han
Shinji Watanabe
VLM
115
12
0
11 Oct 2021
VideoModerator: A Risk-aware Framework for Multimodal Video Moderation in E-Commerce
IEEE Transactions on Visualization and Computer Graphics (TVCG), 2021
Tan Tang
Yanhong Wu
Lingyun Yu
Yuhong Li
Yingcai Wu
186
30
0
08 Sep 2021
Streaming End-to-End ASR based on Blockwise Non-Autoregressive Models
Interspeech (Interspeech), 2021
Tianzi Wang
Yuya Fujita
Xuankai Chang
Shinji Watanabe
238
17
0
20 Jul 2021
A Dialogue-based Information Extraction System for Medical Insurance Assessment
Shuang Peng
Mengdi Zhou
Minghui Yang
Haitao Mi
Shaosheng Cao
Zujie Wen
Teng Xu
Hongbin Wang
Lei Liu
139
4
0
13 Jul 2021
Variational Information Bottleneck for Effective Low-resource Audio Classification
Interspeech (Interspeech), 2021
Shijing Si
Jianzong Wang
Huiming Sun
Jianhan Wu
Chuan Zhang
Xiaoyang Qu
Ning Cheng
Lei Chen
Jing Xiao
138
15
0
10 Jul 2021
Relaxed Attention: A Simple Method to Boost Performance of End-to-End Automatic Speech Recognition
Timo Lohrenz
P. Schwarz
Zhengyang Li
Tim Fingscheidt
152
11
0
02 Jul 2021
Dual Causal/Non-Causal Self-Attention for Streaming End-to-End Speech Recognition
Niko Moritz
Takaaki Hori
Jonathan Le Roux
147
23
0
02 Jul 2021
Pay Better Attention to Attention: Head Selection in Multilingual and Multi-Domain Sequence Modeling
Neural Information Processing Systems (NeurIPS), 2021
Hongyu Gong
Yun Tang
J. Pino
Xian Li
213
13
0
21 Jun 2021
Direct Simultaneous Speech-to-Text Translation Assisted by Synchronized Streaming ASR
Findings (Findings), 2021
Junkun Chen
Mingbo Ma
Renjie Zheng
Liang Huang
225
36
0
11 Jun 2021
Bridging the gap between streaming and non-streaming ASR systems by distilling ensembles of CTC and RNN-T models
Interspeech (Interspeech), 2021
Thibault Doutre
Wei Han
Chung-Cheng Chiu
Ruoming Pang
Olivier Siohan
Liangliang Cao
154
6
0
25 Apr 2021
Advanced Long-context End-to-end Speech Recognition Using Context-expanded Transformers
Interspeech (Interspeech), 2021
Takaaki Hori
Niko Moritz
Chiori Hori
Jonathan Le Roux
145
37
0
19 Apr 2021
TransVG: End-to-End Visual Grounding with Transformers
IEEE International Conference on Computer Vision (ICCV), 2021
Jiajun Deng
Zhengyuan Yang
Tianlang Chen
Wen-gang Zhou
Houqiang Li
ViT
646
442
0
17 Apr 2021
Capturing Multi-Resolution Context by Dilated Self-Attention
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Niko Moritz
Takaaki Hori
Jonathan Le Roux
144
8
0
07 Apr 2021
Extremely Low Footprint End-to-End ASR System for Smart Device
Interspeech (Interspeech), 2021
Zhifu Gao
Yiwu Yao
Shiliang Zhang
Jun Yang
Ming Lei
Ian Mcloughlin
118
15
0
06 Apr 2021
Mutually-Constrained Monotonic Multihead Attention for Online ASR
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Jae-gyun Song
Hajin Shim
Eunho Yang
103
0
0
26 Mar 2021
Transformer-based ASR Incorporating Time-reduction Layer and Fine-tuning with Self-Knowledge Distillation
Interspeech (Interspeech), 2021
Md. Akmal Haidar
Chao Xing
Mehdi Rezagholizadeh
193
6
0
17 Mar 2021
Fine-tuning of Pre-trained End-to-end Speech Recognition with Generative Adversarial Networks
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Md. Akmal Haidar
Mehdi Rezagholizadeh
267
9
0
10 Mar 2021
Alignment Knowledge Distillation for Online Streaming Attention-based Speech Recognition
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2021
Hirofumi Inaguma
Tatsuya Kawahara
376
19
0
28 Feb 2021
Thank you for Attention: A survey on Attention-based Artificial Neural Networks for Automatic Speech Recognition
Intelligent Systems with Applications (ISA), 2021
Priyabrata Karmakar
S. Teng
Guojun Lu
143
37
0
14 Feb 2021
Motion-Based Handwriting Recognition and Word Reconstruction
Junshen Kevin Chen
Wanze Xie
Yutong He
183
1
0
15 Jan 2021
Fast offline Transformer-based end-to-end automatic speech recognition for real-world applications
ETRI Journal (ETRI J.), 2021
Y. Oh
Kiyoung Park
Jeongue Park
OffRL
320
6
0
14 Jan 2021
s-Transformer: Segment-Transformer for Robust Neural Speech Synthesis
Xi Wang
Huaiping Ming
Lei He
Frank Soong
108
5
0
17 Nov 2020
Block-Online Guided Source Separation
Spoken Language Technology Workshop (SLT), 2020
Shota Horiguchi
Yusuke Fujita
Kenji Nagamatsu
150
4
0
16 Nov 2020
Dynamic latency speech recognition with asynchronous revision
Mingkun Huang
Meng Cai
Jun Zhang
Yang Zhang
Yongbin You
Yi He
Zejun Ma
BDL
163
3
0
03 Nov 2020
Semi-Supervised Speech Recognition via Graph-based Temporal Classification
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
Niko Moritz
Takaaki Hori
Jonathan Le Roux
287
30
0
29 Oct 2020
CASS-NAT: CTC Alignment-based Single Step Non-autoregressive Transformer for Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
Ruchao Fan
Wei Chu
Peng Chang
Jing Xiao
156
42
0
28 Oct 2020
Transformer in action: a comparative study of transformer-based acoustic models for large scale speech recognition applications
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
Yongqiang Wang
Yangyang Shi
Frank Zhang
Chunyang Wu
Julian Chan
Ching-Feng Yeh
Alex Xiao
290
28
0
27 Oct 2020
Universal ASR: Unifying Streaming and Non-Streaming ASR Using a Single Encoder-Decoder Model
Zhifu Gao
Shiliang Zhang
Ming Lei
Ian Mcloughlin
CVBM
130
17
0
27 Oct 2020
Transformer-based End-to-End Speech Recognition with Local Dense Synthesizer Attention
Menglong Xu
Shengqiang Li
Xiao-Lei Zhang
261
36
0
23 Oct 2020
Improving Streaming Automatic Speech Recognition With Non-Streaming Model Distillation On Unsupervised Data
Thibault Doutre
Wei Han
Min Ma
Zhiyun Lu
Chung-Cheng Chiu
Ruoming Pang
A. Narayanan
Ananya Misra
Yu Zhang
Liangliang Cao
299
24
0
22 Oct 2020
Developing Real-time Streaming Transformer Transducer for Speech Recognition on Large-scale Dataset
Xie Chen
Yu-Huan Wu
Zhenghao Wang
Shujie Liu
Jinyu Li
284
200
0
22 Oct 2020
Emformer: Efficient Memory Transformer Based Acoustic Model For Low Latency Streaming Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
Yangyang Shi
Yongqiang Wang
Chunyang Wu
Ching-Feng Yeh
Julian Chan
Frank Zhang
Duc Le
M. Seltzer
798
190
0
21 Oct 2020
Dual-mode ASR: Unify and Improve Streaming ASR with Full-context Modeling
Jiahui Yu
Wei Han
Anmol Gulati
Chung-Cheng Chiu
Yue Liu
Tara N. Sainath
Yonghui Wu
Ruoming Pang
374
19
0
12 Oct 2020
Super-Human Performance in Online Low-latency Recognition of Conversational Speech
T. Nguyen
S. Stueker
A. Waibel
BDL
340
43
0
07 Oct 2020
Large-scale Transfer Learning for Low-resource Spoken Language Understanding
Interspeech (Interspeech), 2020
X. Jia
Jianzong Wang
Zhiyong Zhang
Ning Cheng
Jing Xiao
167
17
0
13 Aug 2020