Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1801.00841
Cited By
Exploring Architectures, Data and Units For Streaming End-to-End Speech Recognition with RNN-Transducer
2 January 2018
Kanishka Rao
Hasim Sak
Rohit Prabhavalkar
AI4TS
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Exploring Architectures, Data and Units For Streaming End-to-End Speech Recognition with RNN-Transducer"
50 / 89 papers shown
Title
Aligner-Encoders: Self-Attention Transformers Can Be Self-Transducers
Adam Stooke
Rohit Prabhavalkar
K. Sim
P. M. Mengibar
39
0
0
06 Feb 2025
Disentangling Speakers in Multi-Talker Speech Recognition with Speaker-Aware CTC
Jiawen Kang
Lingwei Meng
Mingyu Cui
Yuejiao Wang
Xixin Wu
Xunying Liu
Helen Meng
41
2
0
19 Sep 2024
Improved Long-Form Speech Recognition by Jointly Modeling the Primary and Non-primary Speakers
Guru Prakash Arumugam
Shuo-yiin Chang
Tara N. Sainath
Rohit Prabhavalkar
Quan Wang
Shaan Bijwadia
29
3
0
18 Dec 2023
Streaming Anchor Loss: Augmenting Supervision with Temporal Significance
U. Sarawgi
John Berkowitz
Vineet Garg
Arnav Kundu
Minsik Cho
Sai Srujana Buddi
Saurabh N. Adya
Ahmed H. Tewfik
34
1
0
09 Oct 2023
ApproBiVT: Lead ASR Models to Generalize Better Using Approximated Bias-Variance Tradeoff Guided Early Stopping and Checkpoint Averaging
Fangyuan Wang
Ming Hao
Yuhai Shi
Bo Xu
MoMe
21
0
0
05 Aug 2023
TST: Time-Sparse Transducer for Automatic Speech Recognition
Xiaohui Zhang
Mangui Liang
Zhengkun Tian
Jiangyan Yi
J. Tao
14
0
0
17 Jul 2023
Reducing the gap between streaming and non-streaming Transducer-based ASR by adaptive two-stage knowledge distillation
Haitao Tang
Yu Fu
Lei Sun
Jiabin Xue
Dan Liu
...
Zhiqiang Ma
Minghui Wu
Jia Pan
Genshun Wan
Ming’En Zhao
29
2
0
27 Jun 2023
Tagged End-to-End Simultaneous Speech Translation Training using Simultaneous Interpretation Data
Yuka Ko
Ryo Fukuda
Yuta Nishikawa
Yasumasa Kano
Katsuhito Sudoh
Satoshi Nakamura
29
6
0
14 Jun 2023
CopyNE: Better Contextual ASR by Copying Named Entities
Shilin Zhou
Zhenghua Li
Yu Hong
Mengdi Zhang
Zhefeng Wang
Baoxing Huai
15
6
0
22 May 2023
A Deliberation-based Joint Acoustic and Text Decoder
S. Mavandadi
Tara N. Sainath
Ke Hu
Zelin Wu
21
7
0
23 Mar 2023
Pyramid Multi-branch Fusion DCNN with Multi-Head Self-Attention for Mandarin Speech Recognition
Kai Liu
Hailiang Xiong
Gangqiang Yang
Zhengfeng Du
Yewen Cao
D. Shah
18
0
0
23 Mar 2023
FoundationTTS: Text-to-Speech for ASR Customization with Generative Language Model
Rui Xue
Yanqing Liu
Lei He
Xuejiao Tan
Linquan Liu
Ed Lin
Sheng Zhao
34
7
0
06 Mar 2023
UML: A Universal Monolingual Output Layer for Multilingual ASR
Chaoyang Zhang
Bo-wen Li
Tara N. Sainath
Trevor Strohman
Shuo-yiin Chang
36
7
0
22 Feb 2023
Confidence Score Based Speaker Adaptation of Conformer Speech Recognition Systems
Jiajun Deng
Xurong Xie
Tianzi Wang
Mingyu Cui
Boyang Xue
Zengrui Jin
Guinan Li
Shujie Hu
Xunying Liu
26
5
0
15 Feb 2023
Lattice-Free Sequence Discriminative Training for Phoneme-Based Neural Transducers
Zijian Yang
Wei Zhou
Ralf Schluter
Hermann Ney
32
4
0
07 Dec 2022
Neural Transducer Training: Reduced Memory Consumption with Sample-wise Computation
Stefan Braun
Erik McDermott
Roger Hsiao
40
1
0
29 Nov 2022
Massively Multilingual ASR on 70 Languages: Tokenization, Architecture, and Generalization Capabilities
Andros Tjandra
Nayan Singhal
David C. Zhang
Ozlem Kalinli
Abdel-rahman Mohamed
Duc Le
M. Seltzer
37
12
0
10 Nov 2022
Phonetic-assisted Multi-Target Units Modeling for Improving Conformer-Transducer ASR system
Li Li
Dongxing Xu
Haoran Wei
Yanhua Long
21
2
0
03 Nov 2022
Factorized Blank Thresholding for Improved Runtime Efficiency of Neural Transducers
Duc Le
Frank Seide
Yuhao Wang
Heng Chang
Kjell Schubert
Ozlem Kalinli
M. Seltzer
19
6
0
02 Nov 2022
Partitioned Gradient Matching-based Data Subset Selection for Compute-Efficient Robust ASR Training
Ashish R. Mittal
D. Sivasubramanian
Rishabh K. Iyer
P. Jyothi
Ganesh Ramakrishnan
19
3
0
30 Oct 2022
LeVoice ASR Systems for the ISCSLP 2022 Intelligent Cockpit Speech Recognition Challenge
Yan Jia
Mihee Hong
Jingyu Hou
Kailong Ren
Sifan Ma
Jin Wang
Fangzhen Peng
Yinglin Ji
Lin Yang
Junjie Wang
25
1
0
14 Oct 2022
E-Branchformer: Branchformer with Enhanced merging for speech recognition
Kwangyoun Kim
Felix Wu
Yifan Peng
Jing Pan
Prashant Sridhar
Kyu Jeong Han
Shinji Watanabe
61
105
0
30 Sep 2022
Improving Mandarin Speech Recogntion with Block-augmented Transformer
Xiaoming Ren
Huifeng Zhu
Liuwei Wei
Minghui Wu
Jie Hao
38
9
0
24 Jul 2022
Improving Streaming End-to-End ASR on Transformer-based Causal Models with Encoder States Revision Strategies
Zehan Li
Haoran Miao
Keqi Deng
Gaofeng Cheng
Sanli Tian
Ta Li
Yonghong Yan
KELM
27
4
0
06 Jul 2022
On Comparison of Encoders for Attention based End to End Speech Recognition in Standalone and Rescoring Mode
Raviraj Joshi
Subodh Kumar
36
2
0
26 Jun 2022
Heterogeneous Data-Centric Architectures for Modern Data-Intensive Applications: Case Studies in Machine Learning and Databases
Geraldo F. Oliveira
Amirali Boroumand
Saugata Ghose
Juan Gómez Luna
O. Mutlu
28
7
0
29 May 2022
Improving CTC-based ASR Models with Gated Interlayer Collaboration
Yuting Yang
Yuke Li
Binbin Du
34
11
0
25 May 2022
Streaming parallel transducer beam search with fast-slow cascaded encoders
Jay Mahadeokar
Yangyang Shi
Ke Li
Duc Le
Jiedan Zhu
Vikas Chandra
Ozlem Kalinli
M. Seltzer
37
15
0
29 Mar 2022
Dynamic Latency for CTC-Based Streaming Automatic Speech Recognition With Emformer
J. Sun
Guiping Zhong
Dinghao Zhou
Baoxiang Li
21
0
0
29 Mar 2022
aaeCAPTCHA: The Design and Implementation of Audio Adversarial CAPTCHA
Md. Imran Hossen
X. Hei
31
4
0
05 Mar 2022
A Conformer Based Acoustic Model for Robust Automatic Speech Recognition
Yufeng Yang
Peidong Wang
DeLiang Wang
20
12
0
01 Mar 2022
Integrating Text Inputs For Training and Adapting RNN Transducer ASR Models
Samuel Thomas
Brian Kingsbury
G. Saon
H. Kuo
36
25
0
26 Feb 2022
Reducing language context confusion for end-to-end code-switching automatic speech recognition
Shuai Zhang
Jiangyan Yi
Zhengkun Tian
J. Tao
Y. Yeung
Liqun Deng
27
11
0
28 Jan 2022
Improving the fusion of acoustic and text representations in RNN-T
Chao Zhang
Bo-wen Li
Zhiyun Lu
Tara N. Sainath
Shuo-yiin Chang
AI4CE
43
12
0
25 Jan 2022
A Study of Transducer based End-to-End ASR with ESPnet: Architecture, Auxiliary Loss and Decoding Strategies
Florian Boyer
Yusuke Shinohara
Takaaki Ishii
Hirofumi Inaguma
Shinji Watanabe
35
34
0
14 Jan 2022
Speech-to-SQL: Towards Speech-driven SQL Query Generation From Natural Language Question
Yuanfeng Song
Raymond Chi-Wing Wong
Xuefang Zhao
Di Jiang
39
13
0
04 Jan 2022
Black-box Adversarial Attacks on Commercial Speech Platforms with Minimal Information
Baolin Zheng
Peipei Jiang
Qian Wang
Qi Li
Chao Shen
Cong Wang
Yunjie Ge
Qingyang Teng
Shenyi Zhang
AAML
18
69
0
19 Oct 2021
Back from the future: bidirectional CTC decoding using future information in speech recognition
Namkyu Jung
Geon-min Kim
Han-Gyu Kim
33
3
0
07 Oct 2021
Google Neural Network Models for Edge Devices: Analyzing and Mitigating Machine Learning Inference Bottlenecks
Amirali Boroumand
Saugata Ghose
Berkin Akin
Ravi Narayanaswami
Geraldo F. Oliveira
Xiaoyu Ma
Eric Shiu
O. Mutlu
25
82
0
29 Sep 2021
Factorized Neural Transducer for Efficient Language Model Adaptation
Xie Chen
Zhong Meng
S. Parthasarathy
Jinyu Li
21
39
0
27 Sep 2021
Integrating Dialog History into End-to-End Spoken Language Understanding Systems
Jatin Ganhotra
Samuel Thomas
H. Kuo
Sachindra Joshi
G. Saon
Zoltán Tüske
Brian Kingsbury
30
10
0
18 Aug 2021
Collaborative Training of Acoustic Encoders for Speech Recognition
Varun K. Nagaraja
Yangyang Shi
Ganesh Venkatesh
Ozlem Kalinli
M. Seltzer
Vikas Chandra
43
11
0
16 Jun 2021
Attention-based Neural Beamforming Layers for Multi-channel Speech Recognition
Bhargav Pulugundla
Yang Gao
Brian King
Gokce Keskin
Sri Harish Reddy Mallidi
Minhua Wu
J. Droppo
Roland Maas
27
2
0
12 May 2021
HMM-Free Encoder Pre-Training for Streaming RNN Transducer
Lu Huang
J. Sun
Yu Tang
Junfeng Hou
Jinkun Chen
Jun Zhang
Zejun Ma
25
3
0
02 Apr 2021
Advancing RNN Transducer Technology for Speech Recognition
G. Saon
Zoltan Tueske
Daniel Bolaños
Brian Kingsbury
43
86
0
17 Mar 2021
Mitigating Edge Machine Learning Inference Bottlenecks: An Empirical Study on Accelerating Google Edge Models
Amirali Boroumand
Saugata Ghose
Berkin Akin
Ravi Narayanaswami
Geraldo F. Oliveira
Xiaoyu Ma
Eric Shiu
O. Mutlu
27
28
0
01 Mar 2021
Echo State Speech Recognition
H. Shrivastava
Ankush Garg
Yuan Cao
Yu Zhang
Tara N. Sainath
50
22
0
18 Feb 2021
End-to-End Automatic Speech Recognition with Deep Mutual Learning
Ryo Masumura
Mana Ihori
Akihiko Takashima
Tomohiro Tanaka
Takanori Ashihara
27
5
0
16 Feb 2021
Bayesian Learning for Deep Neural Network Adaptation
Xurong Xie
Xunying Liu
Tan Lee
Lan Wang
BDL
22
20
0
14 Dec 2020
Less Is More: Improved RNN-T Decoding Using Limited Label Context and Path Merging
Rohit Prabhavalkar
Yanzhang He
David Rybach
S. Campbell
A. Narayanan
Trevor Strohman
Tara N. Sainath
52
35
0
12 Dec 2020
1
2
Next