Advancing RNN Transducer Technology for Speech Recognition

17 March 2021

Papers citing "Advancing RNN Transducer Technology for Speech Recognition"

Title
Can Local Representation Alignment RNNs Solve Temporal Tasks? Nikolay Manchev Luis C. Garcia-Peraza-Herrera AI4TS 36 0 0 18 Apr 2025
Brain-inspired Artificial Intelligence: A Comprehensive Review Jing Ren Feng Xia AI4CE 32 3 0 27 Aug 2024
Soft Language Identification for Language-Agnostic Many-to-One End-to-End Speech Translation Peidong Wang Jian Xue Jinyu Li Junkun Chen Aswin Shanmugam Subramanian 31 0 0 12 Jun 2024
Boosting keyword spotting through on-device learnable user speech characteristics Cristian Cioflan Lukas Cavigelli Luca Benini 33 3 0 12 Mar 2024
Recurrent Aligned Network for Generalized Pedestrian Trajectory Prediction Yonghao Dong Le Wang Sanpin Zhou Gang Hua Changyin Sun 37 5 0 09 Mar 2024
Traditional Machine Learning Models and Bidirectional Encoder Representations From Transformer (BERT)-Based Automatic Classification of Tweets About Eating Disorders: Algorithm Development and Validation Study J. Benítez-Andrades José-Manuel Alija-Pérez Maria-Esther Vidal R. Pastor-Vargas María Teresa García-Ordás 19 36 0 08 Feb 2024
Semi-Autoregressive Streaming ASR With Label Context Siddhant Arora G. Saon Shinji Watanabe Brian Kingsbury AI4TS 23 5 0 19 Sep 2023
Multiple Representation Transfer from Large Language Models to End-to-End ASR Systems Takuma Udagawa Masayuki Suzuki Gakuto Kurata Masayasu Muraoka G. Saon 30 2 0 07 Sep 2023
CIF-T: A Novel CIF-based Transducer Architecture for Automatic Speech Recognition Tian-Hao Zhang Dinghao Zhou Guiping Zhong Jiaming Zhou Baoxiang Li 18 3 0 26 Jul 2023
Improving RNN-Transducers with Acoustic LookAhead Vinit Unni Ashish R. Mittal P. Jyothi Sunita Sarawagi 29 2 0 11 Jul 2023
Accelerating Transducers through Adjacent Token Merging Yuang Li Yu-Huan Wu Jinyu Li Shujie Liu 22 4 0 28 Jun 2023
End-to-End Speech Recognition: A Survey Rohit Prabhavalkar Takaaki Hori Tara N. Sainath Ralf Schluter Shinji Watanabe VLM 26 148 0 03 Mar 2023
Diagonal State Space Augmented Transformers for Speech Recognition G. Saon Ankit Gupta Xiaodong Cui AI4TS 27 26 0 27 Feb 2023
Confidence Score Based Speaker Adaptation of Conformer Speech Recognition Systems Jiajun Deng Xurong Xie Tianzi Wang Mingyu Cui Boyang Xue Zengrui Jin Guinan Li Shujie Hu Xunying Liu 26 5 0 15 Feb 2023
A Weakly-Supervised Streaming Multilingual Speech Model with Truly Zero-Shot Capability Jian Xue Peidong Wang Jinyu Li Eric Sun 26 10 0 04 Nov 2022
Partitioned Gradient Matching-based Data Subset Selection for Compute-Efficient Robust ASR Training Ashish R. Mittal D. Sivasubramanian Rishabh K. Iyer P. Jyothi Ganesh Ramakrishnan 17 3 0 30 Oct 2022
VQ-T: RNN Transducers using Vector-Quantized Prediction Network States Jiatong Shi G. Saon David Haws Shinji Watanabe Brian Kingsbury 27 3 0 03 Aug 2022
Multiple-hypothesis RNN-T Loss for Unsupervised Fine-tuning and Self-training of Neural Transducer Cong-Thanh Do Mohan Li R. Doddipatla 14 3 0 29 Jul 2022
Extending RNN-T-based speech recognition systems with emotion and language classification Zvi Kons Hagai Aronowitz E. Morais Matheus Damasceno H. Kuo Samuel Thomas G. Saon 14 5 0 28 Jul 2022
Improving Deliberation by Text-Only and Semi-Supervised Training Ke Hu Tara N. Sainath Yanzhang He Rohit Prabhavalkar Trevor Strohman S. Mavandadi Weiran Wang 26 12 0 29 Jun 2022
Improving the Training Recipe for a Robust Conformer-based Hybrid Model Mohammad Zeineldeen Jingjing Xu Christoph Luscher Ralf Schluter Hermann Ney 28 18 0 26 Jun 2022
Accelerating Inference and Language Model Fusion of Recurrent Neural Network Transducers via End-to-End 4-bit Quantization A. Fasoli Chia-Yu Chen Mauricio Serrano Swagath Venkataramani G. Saon Xiaodong Cui Brian Kingsbury K. Gopalakrishnan MQ 19 6 0 16 Jun 2022
A Deep Reinforcement Learning Blind AI in DareFightingICE Thai Van Nguyen Xincheng Dai Ibrahim Khan R. Thawonmas H. V. Pham VLM 23 7 0 16 May 2022
Efficient Training of Neural Transducer for Speech Recognition Wei Zhou Wilfried Michel Ralf Schluter Hermann Ney AI4TS 24 22 0 22 Apr 2022
Large-Scale Streaming End-to-End Speech Translation with Neural Transducers Jian Xue Peidong Wang Jinyu Li Matt Post Yashesh Gaur AI4TS 24 26 0 11 Apr 2022
Towards End-to-End Integration of Dialog History for Improved Spoken Language Understanding Vishal Sunder Samuel Thomas H. Kuo Jatin Ganhotra Brian Kingsbury Eric Fosler-Lussier VLM 41 10 0 11 Apr 2022
Effect and Analysis of Large-scale Language Model Rescoring on Competitive ASR Systems Takuma Udagawa Masayuki Suzuki Gakuto Kurata N. Itoh G. Saon 34 23 0 01 Apr 2022
Memory-Efficient Training of RNN-Transducer with Sampled Softmax Jaesong Lee Lukas Lee Shinji Watanabe 25 8 0 31 Mar 2022
Improving Generalization of Deep Neural Network Acoustic Models with Length Perturbation and N-best Based Label Smoothing Xiaodong Cui G. Saon Tohru Nagano Masayuki Suzuki Takashi Fukuda Brian Kingsbury Gakuto Kurata 29 7 0 29 Mar 2022
Towards Reducing the Need for Speech Training Data To Build Spoken Language Understanding Systems Samuel Thomas H. Kuo Brian Kingsbury G. Saon 14 24 0 26 Feb 2022
Integrating Text Inputs For Training and Adapting RNN Transducer ASR Models Samuel Thomas Brian Kingsbury G. Saon H. Kuo 20 25 0 26 Feb 2022
Improving End-to-End Models for Set Prediction in Spoken Language Understanding H. Kuo Zoltán Tüske Samuel Thomas Brian Kingsbury G. Saon 16 0 0 28 Jan 2022
Improving the fusion of acoustic and text representations in RNN-T Chao Zhang Bo-wen Li Zhiyun Lu Tara N. Sainath Shuo-yiin Chang AI4CE 43 12 0 25 Jan 2022
Improving Factored Hybrid HMM Acoustic Modeling without State Tying Tina Raissi Eugen Beck Ralf Schluter Hermann Ney 26 5 0 24 Jan 2022
Recent Advances in End-to-End Automatic Speech Recognition Jinyu Li VLM 26 362 0 02 Nov 2021
On Language Model Integration for RNN Transducer based Speech Recognition Wei Zhou Zuoyun Zheng Ralf Schluter Hermann Ney 29 22 0 13 Oct 2021
Input Length Matters: Improving RNN-T and MWER Training for Long-form Telephony Speech Recognition Zhiyun Lu Yanwei Pan Thibault Doutre Parisa Haghani Liangliang Cao Rohit Prabhavalkar C. Zhang Trevor Strohman AuLLM 72 14 0 08 Oct 2021
Towards efficient end-to-end speech recognition with biologically-inspired neural networks Thomas Bohnstingl Ayush Garg Stanisław Woźniak G. Saon E. Eleftheriou A. Pantazi 24 5 0 04 Oct 2021
Residual Adapters for Parameter-Efficient ASR Adaptation to Atypical and Accented Speech Katrin Tomanek Vicky Zayats Dirk Padfield K. Vaillancourt Fadi Biadsy 53 57 0 14 Sep 2021
4-bit Quantization of LSTM-based Speech Recognition Models A. Fasoli Chia-Yu Chen Mauricio Serrano Xiao Sun Naigang Wang ... Xiaodong Cui Brian Kingsbury Wei Zhang Zoltán Tüske K. Gopalakrishnan MQ 26 21 0 27 Aug 2021
Reducing Exposure Bias in Training Recurrent Neural Network Transducers Xiaodong Cui Brian Kingsbury G. Saon David Haws Zoltán Tüske 13 5 0 24 Aug 2021
Integrating Dialog History into End-to-End Spoken Language Understanding Systems Jatin Ganhotra Samuel Thomas H. Kuo Sachindra Joshi G. Saon Zoltán Tüske Brian Kingsbury 22 10 0 18 Aug 2021
Advancing CTC-CRF Based End-to-End Speech Recognition with Wordpieces and Conformers Huahuan Zheng Wenjie Peng Zhijian Ou Jinsong Zhang 18 5 0 07 Jul 2021
On the limit of English conversational speech recognition Zoltán Tüske G. Saon Brian Kingsbury 19 50 0 03 May 2021
Scaling End-to-End Models for Large-Scale Multilingual ASR Bo-wen Li Ruoming Pang Tara N. Sainath Anmol Gulati Yu Zhang James Qin Parisa Haghani W. R. Huang Min Ma Junwen Bai CLL 26 76 0 30 Apr 2021
RNN Transducer Models For Spoken Language Understanding Samuel Thomas H. Kuo G. Saon Zoltán Tüske Brian Kingsbury Gakuto Kurata Zvi Kons R. Hoory 11 14 0 08 Apr 2021
Towards Consistent Hybrid HMM Acoustic Modeling Tina Raissi Eugen Beck Ralf Schluter Hermann Ney 11 5 0 06 Apr 2021
Minimum Bayes Risk Training of RNN-Transducer for End-to-End Speech Recognition Chao Weng Chengzhu Yu Jia Cui Chunlei Zhang Dong Yu 69 39 0 28 Nov 2019
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation Yonghui Wu M. Schuster Z. Chen Quoc V. Le Mohammad Norouzi ... Alex Rudnick Oriol Vinyals G. Corrado Macduff Hughes J. Dean AIMat 716 6,743 0 26 Sep 2016