Title
Pruned RNN-T for fast, memory-efficient ASR training Fangjun Kuang Liyong Guo Wei Kang Long Lin Mingshuang Luo Zengwei Yao Daniel Povey 27 64 0 23 Jun 2022
Heterogeneous Data-Centric Architectures for Modern Data-Intensive Applications: Case Studies in Machine Learning and Databases Geraldo F. Oliveira Amirali Boroumand Saugata Ghose Juan Gómez Luna O. Mutlu 28 7 0 29 May 2022
Contextual Adapters for Personalized Speech Recognition in Neural Transducers Kanthashree Mysore Sathyendra Thejaswi Muniyappa Feng-Ju Chang Jing Liu Jinru Su Grant P. Strimel Athanasios Mouchtaris Siegfried Kunzmann 19 75 0 26 May 2022
Minimising Biasing Word Errors for Contextual ASR with the Tree-Constrained Pointer Generator Guangzhi Sun C. Zhang P. Woodland 34 14 0 18 May 2022
Deep Learning Enabled Semantic Communications with Speech Recognition and Synthesis Zhenzi Weng Zhijin Qin Xiaoming Tao Chengkang Pan Guangyi Liu Geoffrey Ye Li 41 132 0 09 May 2022
Personal VAD 2.0: Optimizing Personal Voice Activity Detection for On-Device Speech Recognition Shaojin Ding R. Rikhye Qiao Liang Yanzhang He Quan Wang A. Narayanan Tom O'Malley Ian McGraw 26 27 0 08 Apr 2022
4-bit Conformer with Native Quantization Aware Training for Speech Recognition Shaojin Ding Phoenix Meadowlark Yanzhang He Lukasz Lew Shivani Agrawal Oleg Rybakov MQ 31 32 0 29 Mar 2022
Streaming parallel transducer beam search with fast-slow cascaded encoders Jay Mahadeokar Yangyang Shi Ke Li Duc Le Jiedan Zhu Vikas Chandra Ozlem Kalinli M. Seltzer 35 15 0 29 Mar 2022
WeNet 2.0: More Productive End-to-End Speech Recognition Toolkit Binbin Zhang Di Wu Zhendong Peng Xingcheng Song Zhuoyuan Yao Hang Lv Linfu Xie Chao Yang Fuping Pan Jianwei Niu VLM 29 94 0 29 Mar 2022
Enhance Language Identification using Dual-mode Model with Knowledge Distillation Hexin Liu Leibny Paola García Perera Andy W. H. Khong Justin Dauwels S. Styles Sanjeev Khudanpur VLM 30 5 0 07 Mar 2022
Towards Contextual Spelling Correction for Customization of End-to-end Speech Recognition Systems Xiaoqiang Wang Yanqing Liu Jinyu Li Veljko Miljanic Sheng Zhao H. Khalil KELM 13 18 0 02 Mar 2022
Integrating Text Inputs For Training and Adapting RNN Transducer ASR Models Samuel Thomas Brian Kingsbury G. Saon H. Kuo 36 25 0 26 Feb 2022
Spanish and English Phoneme Recognition by Training on Simulated Classroom Audio Recordings of Collaborative Learning Environments Mario Esparza 24 0 0 21 Feb 2022
RescoreBERT: Discriminative Speech Recognition Rescoring with BERT Liyan Xu Yile Gu J. Kolehmainen Haidar Khan Ankur Gandhe Ariya Rastrow A. Stolcke I. Bulyko 42 46 0 02 Feb 2022
Neural-FST Class Language Model for End-to-End Speech Recognition A. Bruguier Duc Le Rohit Prabhavalkar Dangna Li Zhe Liu Bo Wang Eun Chang Fuchun Peng Ozlem Kalinli M. Seltzer 20 6 0 28 Jan 2022
Improving the fusion of acoustic and text representations in RNN-T Chao Zhang Bo-wen Li Zhiyun Lu Tara N. Sainath Shuo-yiin Chang AI4CE 43 12 0 25 Jan 2022
A Study of Transducer based End-to-End ASR with ESPnet: Architecture, Auxiliary Loss and Decoding Strategies Florian Boyer Yusuke Shinohara Takaaki Ishii Hirofumi Inaguma Shinji Watanabe 35 34 0 14 Jan 2022
A Likelihood Ratio based Domain Adaptation Method for E2E Models Chhavi Choudhury Ankur Gandhe Xiaohan Ding I. Bulyko 27 10 0 10 Jan 2022
Mixed Precision Low-bit Quantization of Neural Network Language Models for Speech Recognition Junhao Xu Jianwei Yu Shoukang Hu Xunying Liu Helen Meng MQ 30 13 0 29 Nov 2021
Efficient Softmax Approximation for Deep Neural Networks with Attention Mechanism Ihor Vasyltsov Wooseok Chang 33 12 0 21 Nov 2021
Context-Aware Transformer Transducer for Speech Recognition Feng-Ju Chang Jing Liu Martin H. Radfar Athanasios Mouchtaris M. Omologo Ariya Rastrow Siegfried Kunzmann 21 79 0 05 Nov 2021
Recent Advances in End-to-End Automatic Speech Recognition Jinyu Li VLM 35 363 0 02 Nov 2021
Differentiable NAS Framework and Application to Ads CTR Prediction Ravi Krishna Aravind Kalaiah Bichen Wu Maxim Naumov Dheevatsa Mudigere M. Smelyanskiy Kurt Keutzer 28 8 0 25 Oct 2021
Data-Driven Offline Optimization For Architecting Hardware Accelerators Aviral Kumar Amir Yazdanbakhsh Milad Hashemi Kevin Swersky Sergey Levine 27 36 0 20 Oct 2021
Scribosermo: Fast Speech-to-Text models for German and other Languages Daniel Bermuth Alexander Poeppel W. Reif 29 9 0 15 Oct 2021
Universal Paralinguistic Speech Representations Using Self-Supervised Conformers Joel Shor A. Jansen Wei Han Daniel S. Park Yu Zhang SSL AI4TS 43 54 0 09 Oct 2021
Personalized Automatic Speech Recognition Trained on Small Disordered Speech Datasets Jimmy Tobin Katrin Tomanek 27 27 0 09 Oct 2021
Streaming Transformer Transducer Based Speech Recognition Using Non-Causal Convolution Yangyang Shi Chunyang Wu Dilin Wang Alex Xiao Jay Mahadeokar ... Ke Li Yuan Shangguan Varun K. Nagaraja Ozlem Kalinli M. Seltzer 36 15 0 07 Oct 2021
Towards efficient end-to-end speech recognition with biologically-inspired neural networks Thomas Bohnstingl Ayush Garg Stanisław Woźniak G. Saon E. Eleftheriou A. Pantazi 29 5 0 04 Oct 2021
Federated Learning in ASR: Not as Easy as You Think Wentao Yu J. Freiwald Soren Tewes F. Huennemeyer D. Kolossa FedML 27 17 0 30 Sep 2021
Google Neural Network Models for Edge Devices: Analyzing and Mitigating Machine Learning Inference Bottlenecks Amirali Boroumand Saugata Ghose Berkin Akin Ravi Narayanaswami Geraldo F. Oliveira Xiaoyu Ma Eric Shiu O. Mutlu 25 82 0 29 Sep 2021
Private Language Model Adaptation for Speech Recognition Zhe Liu Ke Li Shreyan Bakshi Fuchun Peng 34 6 0 28 Sep 2021
BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition Yu Zhang Daniel S. Park Wei Han James Qin Anmol Gulati ... Zhifeng Chen Quoc V. Le Chung-Cheng Chiu Ruoming Pang Yonghui Wu SSL 27 175 0 27 Sep 2021
Factorized Neural Transducer for Efficient Language Model Adaptation Xie Chen Zhong Meng S. Parthasarathy Jinyu Li 21 39 0 27 Sep 2021
Turn-to-Diarize: Online Speaker Diarization Constrained by Transformer Transducer Speaker Turn Detection Wei Xia Han Lu Quan Wang Anshuman Tripathi Yiling Huang Ignacio López Moreno Hasim Sak 46 51 0 23 Sep 2021
iRNN: Integer-only Recurrent Neural Network Eyyub Sari Vanessa Courville V. Nia MQ 56 4 0 20 Sep 2021
Tied & Reduced RNN-T Decoder Rami Botros Tara N. Sainath R. David Emmanuel Guzman Wei Li Yanzhang He 38 55 0 15 Sep 2021
Residual Adapters for Parameter-Efficient ASR Adaptation to Atypical and Accented Speech Katrin Tomanek Vicky Zayats Dirk Padfield K. Vaillancourt Fadi Biadsy 59 57 0 14 Sep 2021
4-bit Quantization of LSTM-based Speech Recognition Models A. Fasoli Chia-Yu Chen Mauricio Serrano Xiao Sun Naigang Wang ... Xiaodong Cui Brian Kingsbury Wei Zhang Zoltán Tüske K. Gopalakrishnan MQ 26 21 0 27 Aug 2021
Reducing Exposure Bias in Training Recurrent Neural Network Transducers Xiaodong Cui Brian Kingsbury G. Saon David Haws Zoltán Tüske 19 5 0 24 Aug 2021
Integrating Dialog History into End-to-End Spoken Language Understanding Systems Jatin Ganhotra Samuel Thomas H. Kuo Sachindra Joshi G. Saon Zoltán Tüske Brian Kingsbury 30 10 0 18 Aug 2021
Learning a Neural Diff for Speech Models J. Macoskey Grant P. Strimel Ariya Rastrow 18 2 0 03 Aug 2021
SVEva Fair: A Framework for Evaluating Fairness in Speaker Verification Wiebke Toussaint Aaron Yi Ding 27 10 0 26 Jul 2021
Comparing Supervised Models And Learned Speech Representations For Classifying Intelligibility Of Disordered Speech On Selected Phrases Subhashini Venugopalan Joel Shor Manoj Plakal Jimmy Tobin Katrin Tomanek Jordan R. Green Michael P. Brenner 27 12 0 08 Jul 2021
On-Device Personalization of Automatic Speech Recognition Models for Disordered Speech Katrin Tomanek Françoise Beaufays Julie Cattiau Angad Chandorkar K. Sim 21 15 0 18 Jun 2021
Collaborative Training of Acoustic Encoders for Speech Recognition Varun K. Nagaraja Yangyang Shi Ganesh Venkatesh Ozlem Kalinli M. Seltzer Vikas Chandra 43 11 0 16 Jun 2021
SynthASR: Unlocking Synthetic Data for Speech Recognition A. Fazel Wei Yang Yulan Liu Roberto Barra-Chicote Yi Meng Roland Maas J. Droppo SyDa 18 48 0 14 Jun 2021
End-to-end Neural Diarization: From Transformer to Conformer Yi Y. Liu Eunjung Han Chul Lee A. Stolcke 22 40 0 14 Jun 2021
Scaling End-to-End Models for Large-Scale Multilingual ASR Bo-wen Li Ruoming Pang Tara N. Sainath Anmol Gulati Yu Zhang James Qin Parisa Haghani Yifan Jiang Min Ma Junwen Bai CLL 34 76 0 30 Apr 2021
Advanced Long-context End-to-end Speech Recognition Using Context-expanded Transformers Takaaki Hori Niko Moritz Chiori Hori Jonathan Le Roux 30 34 0 19 Apr 2021