1510.08983
Cited By
Highway Long Short-Term Memory RNNs for Distant Speech Recognition
30 October 2015
Yu Zhang
Guoguo Chen
Dong Yu
Kaisheng Yao
Sanjeev Khudanpur
James R. Glass
3DV
AI4TS
Papers citing
"Highway Long Short-Term Memory RNNs for Distant Speech Recognition"
50 / 77 papers shown
BERSting at the Screams: A Benchmark for Distanced, Emotional and Shouted Speech Recognition
Paige Tuttosi
Mantaj Dhillon
Luna Sang
Shane Eastwood
Poorvi Bhatia
Quang Minh Dinh
Avni Kapoor
Yewon Jin
Angelica Lim
38
0
0
30 Apr 2025
Streaming Bilingual End-to-End ASR model using Attention over Multiple Softmax
Aditya Patil
Vikas Joshi
Purvi Agrawal
Rupeshkumar Mehta
16
1
0
22 Jan 2024
Efficiency-oriented approaches for self-supervised speech representation learning
Luis Lugo
Valentin Vielzeuf
SSL
38
1
0
18 Dec 2023
Speak While You Think: Streaming Speech Synthesis During Text Generation
Avihu Dekel
Slava Shechtman
Raul Fernandez
David Haws
Zvi Kons
R. Hoory
27
8
0
20 Sep 2023
Mask the Correct Tokens: An Embarrassingly Simple Approach for Error Correction
Kai Shen
Yichong Leng
Xuejiao Tan
Si-Qi Tang
Yuan Zhang
Wenjie Liu
Ed Lin
32
13
0
23 Nov 2022
Deep Learning Enabled Semantic Communications with Speech Recognition and Synthesis
Zhenzi Weng
Zhijin Qin
Xiaoming Tao
Chengkang Pan
Guangyi Liu
Geoffrey Ye Li
44
132
0
09 May 2022
Relation Regularized Scene Graph Generation
Yuyu Guo
Lianli Gao
Jingkuan Song
Peng Wang
N. Sebe
Heng Tao Shen
Xuelong Li
34
14
0
22 Feb 2022
Improving the fusion of acoustic and text representations in RNN-T
Chao Zhang
Bo Li
Zhiyun Lu
Tara N. Sainath
Shuo-yiin Chang
AI4CE
43
12
0
25 Jan 2022
Recent Advances in End-to-End Automatic Speech Recognition
Jinyu Li
VLM
40
363
0
02 Nov 2021
FastCorrect 2: Fast Error Correction on Multiple Candidates for Automatic Speech Recognition
Yichong Leng
Xu Tan
Rui Wang
Linchen Zhu
Jin Xu
...
Linquan Liu
Tao Qin
Xiang-Yang Li
Ed Lin
Tie-Yan Liu
40
40
0
29 Sep 2021
VAD-free Streaming Hybrid CTC/Attention ASR for Unsegmented Recording
Hirofumi Inaguma
Tatsuya Kawahara
19
2
0
15 Jul 2021
FastCorrect: Fast Error Correction with Edit Alignment for Automatic Speech Recognition
Yichong Leng
Xu Tan
Linchen Zhu
Jin Xu
Renqian Luo
Linquan Liu
Tao Qin
Xiang-Yang Li
Ed Lin
Tie-Yan Liu
KELM
24
63
0
09 May 2021
HMM-Free Encoder Pre-Training for Streaming RNN Transducer
Lu Huang
J. Sun
Yu Tang
Junfeng Hou
Jinkun Chen
Jun Zhang
Zejun Ma
25
3
0
02 Apr 2021
Bilateral Control-Based Imitation Learning for Velocity-Controlled Robot
S. Sakaino
27
3
0
06 Mar 2021
Syntactic and Semantic-driven Learning for Open Information Extraction
Jialong Tang
Yaojie Lu
Hongyu Lin
Xianpei Han
Le Sun
Xinyan Xiao
Hua Wu
37
4
0
05 Mar 2021
Alignment Knowledge Distillation for Online Streaming Attention-based Speech Recognition
Hirofumi Inaguma
Tatsuya Kawahara
27
13
0
28 Feb 2021
Imitation Learning for Variable Speed Contact Motion for Operation up to Control Bandwidth
S. Sakaino
K. Fujimoto
Yuki Saigusa
T. Tsuji
45
11
0
20 Feb 2021
Wake Word Detection with Streaming Transformers
Yiming Wang
Hang Lv
Daniel Povey
Lei Xie
Sanjeev Khudanpur
AI4TS
36
33
0
08 Feb 2021
Educational Content Linking for Enhancing Learning Need Remediation in MOOCs
Shang-Wen Li
15
0
0
31 Dec 2020
Improving RNN Transducer Based ASR with Auxiliary Tasks
Chunxi Liu
Frank Zhang
Duc Le
Suyoun Kim
Yatharth Saraf
Geoffrey Zweig
31
49
0
05 Nov 2020
Alignment Restricted Streaming Recurrent Neural Network Transducer
Jay Mahadeokar
Yuan Shangguan
Duc Le
Gil Keren
Hang Su
Thong Le
Ching-Feng Yeh
Christian Fuegen
M. Seltzer
AI4TS
28
63
0
05 Nov 2020
Transformer in action: a comparative study of transformer-based acoustic models for large scale speech recognition applications
Yongqiang Wang
Yangyang Shi
Frank Zhang
Chunyang Wu
Julian Chan
Ching-Feng Yeh
Alex Xiao
20
21
0
27 Oct 2020
Cascaded encoders for unifying streaming and non-streaming ASR
A. Narayanan
Tara N. Sainath
Ruoming Pang
Jiahui Yu
Chung-Cheng Chiu
Rohit Prabhavalkar
Ehsan Variani
Trevor Strohman
AuLLM
8
85
0
27 Oct 2020
Improved Neural Language Model Fusion for Streaming Recurrent Neural Network Transducer
Suyoun Kim
Shangguan Yuan
Jay Mahadeokar
A. Bruguier
Christian Fuegen
M. Seltzer
Duc Le
23
28
0
26 Oct 2020
Developing Real-time Streaming Transformer Transducer for Speech Recognition on Large-scale Dataset
Xie Chen
Yu-Huan Wu
Zhenghao Wang
Shujie Liu
Jinyu Li
22
169
0
22 Oct 2020
Transfer Learning Approaches for Streaming End-to-End Speech Recognition System
Vikas Joshi
Rui Zhao
Rupeshkumar Mehta
Kshitiz Kumar
Jinyu Li
22
22
0
12 Aug 2020
Improving Recurrent Neural Network Responsiveness to Acute Clinical Events
D. Ledbetter
Eugene Laksana
M. Aczon
R. Wetzel
OOD
14
3
0
28 Jul 2020
Gated Recurrent Context: Softmax-free Attention for Online Encoder-Decoder Speech Recognition
Hyeonseung Lee
Woohyun Kang
Sung Jun Cheon
Hyeongju Kim
N. Kim
34
3
0
10 Jul 2020
Exploring Transformers for Large-Scale Speech Recognition
Liang Lu
Changliang Liu
Jinyu Li
Jiawei Liu
16
40
0
19 May 2020
Faster, Simpler and More Accurate Hybrid ASR Systems Using Wordpieces
Frank Zhang
Yongqiang Wang
Xiaohui Zhang
Chunxi Liu
Yatharth Saraf
Geoffrey Zweig
18
20
0
19 May 2020
Streaming Transformer-based Acoustic Models Using Self-attention with Augmented Memory
Chunyang Wu
Yongqiang Wang
Yangyang Shi
Ching-Feng Yeh
Frank Zhang
RALM
31
60
0
16 May 2020
CTC-synchronous Training for Monotonic Attention Model
Hirofumi Inaguma
Masato Mimura
Tatsuya Kawahara
12
7
0
10 May 2020
High-Accuracy and Low-Latency Speech Recognition with Two-Head Contextual Layer Trajectory LSTM Model
Jinyu Li
Rui Zhao
Eric Sun
J. H. M. Wong
Amit Das
Zhong Meng
Jiawei Liu
VLM
24
24
0
17 Mar 2020
Refined Gate: A Simple and Effective Gating Mechanism for Recurrent Units
Zhanzhan Cheng
Yunlu Xu
Mingjian Cheng
Yu Qiao
Shiliang Pu
Yi Niu
Fei Wu
16
8
0
26 Feb 2020
Scaling Up Online Speech Recognition Using ConvNets
Vineel Pratap
Qiantong Xu
Jacob Kahn
Gilad Avidov
Tatiana Likhomanenko
Awni Y. Hannun
Vitaliy Liptchinsky
Gabriel Synnaeve
R. Collobert
154
38
0
27 Jan 2020
Utterance-level Permutation Invariant Training with Latency-controlled BLSTM for Single-channel Multi-talker Speech Separation
Lu Huang
Gaofeng Cheng
Pengyuan Zhang
Yi Yang
Shumin Xu
Jiasong Sun
6
8
0
25 Dec 2019
A Syntax-aware Multi-task Learning Framework for Chinese Semantic Role Labeling
Qingrong Xia
Zhenghua Li
Min Zhang
28
17
0
12 Nov 2019
RNN-T For Latency Controlled ASR With Improved Beam Search
Mahaveer Jain
Kjell Schubert
Jay Mahadeokar
Ching-Feng Yeh
Kaustubh Kalgaonkar
Anuroop Sriram
Christian Fuegen
M. Seltzer
22
44
0
05 Nov 2019
Predicting word error rate for reverberant speech
H. Gamper
Dimitra Emmanouilidou
Sebastian Braun
I. Tashev
11
9
0
01 Nov 2019
A memory enhanced LSTM for modeling complex temporal dependencies
Sneha Aenugu
11
0
0
25 Oct 2019
An Empirical Study of Efficient ASR Rescoring with Transformers
Hongzhao Huang
Fuchun Peng
KELM
21
22
0
24 Oct 2019
G2G: TTS-Driven Pronunciation Learning for Graphemic Hybrid ASR
Duc Le
T. Koehler
Christian Fuegen
M. Seltzer
30
16
0
22 Oct 2019
Transformer-based Acoustic Modeling for Hybrid Speech Recognition
Yongqiang Wang
Abdel-rahman Mohamed
Duc Le
Chunxi Liu
Alex Xiao
...
Xiaohui Zhang
Frank Zhang
Christian Fuegen
Geoffrey Zweig
M. Seltzer
16
248
0
22 Oct 2019
From Senones to Chenones: Tied Context-Dependent Graphemes for Hybrid Speech Recognition
Duc Le
Xiaohui Zhang
Weiyi Zheng
C. Fügen
Geoffrey Zweig
M. Seltzer
20
63
0
02 Oct 2019
Improving RNN Transducer Modeling for End-to-End Speech Recognition
Jinyu Li
Rui Zhao
Hu Hu
Jiawei Liu
19
170
0
26 Sep 2019
Multilingual Graphemic Hybrid ASR with Massive Data Augmentation
Chunxi Liu
Qiaochu Zhang
Xiaohui Zhang
Kritika Singh
Yatharth Saraf
Geoffrey Zweig
29
27
0
14 Sep 2019
Feature-Set-Engineering for Detecting Freezing of Gait in Parkinson's Disease using Deep Recurrent Neural Networks
Spyroula Masiala
W. Huijbers
M. Atzmüller
14
19
0
08 Sep 2019
Syntax-aware Neural Semantic Role Labeling
Qingrong Xia
Zhenghua Li
Min Zhang
Meishan Zhang
Guohong Fu
Rui Wang
Luo Si
NAI
8
41
0
22 Jul 2019
Transfer Learning from Audio-Visual Grounding to Speech Recognition
Wei-Ning Hsu
David Harwath
James R. Glass
SSL
18
32
0
09 Jul 2019
To Tune or Not To Tune? How About the Best of Both Worlds?
Ran A. Wang
Haibo Su
Chunye Wang
Kailin Ji
J. Ding
VLM
36
17
0
09 Jul 2019