EESEN: End-to-End Speech Recognition using Deep RNN Models and WFST-based Decoding

29 July 2015

Papers citing "EESEN: End-to-End Speech Recognition using Deep RNN Models and WFST-based Decoding"

50 / 264 papers shown

Streaming parallel transducer beam search with fast-slow cascaded encoders Jay Mahadeokar Yangyang Shi Ke Li Duc Le Jiedan Zhu Vikas Chandra Ozlem Kalinli M. Seltzer 35 15 0 29 Mar 2022
Dynamic Latency for CTC-Based Streaming Automatic Speech Recognition With Emformer J. Sun Guiping Zhong Dinghao Zhou Baoxiang Li 21 0 0 29 Mar 2022
Locality Matters: A Locality-Biased Linear Attention for Automatic Speech Recognition J. Sun Guiping Zhong Dinghao Zhou Baoxiang Li Yiran Zhong 28 7 0 29 Mar 2022
Saving RNN Computations with a Neuron-Level Fuzzy Memoization Scheme Franyell Silfa J. Arnau Antonio González 27 1 0 14 Feb 2022
Improving Automatic Speech Recognition for Non-Native English with Transfer Learning and Language Model Decoding Peter Sullivan Toshiko Shibano Muhammad Abdul-Mageed 41 11 0 10 Feb 2022
Star Temporal Classification: Sequence Classification with Partially Labeled Data Vineel Pratap Awni Y. Hannun Gabriel Synnaeve R. Collobert 23 8 0 28 Jan 2022
LiteLSTM Architecture for Deep Recurrent Neural Networks Nelly Elsayed Zag ElSayed Anthony Maida 40 5 0 27 Jan 2022
Large-Scale Inventory Optimization: A Recurrent-Neural-Networks-Inspired Simulation Approach T. Wan L. Hong 19 10 0 15 Jan 2022
A Survey on Adversarial Attacks for Malware Analysis Kshitiz Aryal Maanak Gupta Mahmoud Abdelsalam AAML 34 49 0 16 Nov 2021
Recent Advances in End-to-End Automatic Speech Recognition Jinyu Li VLM 35 363 0 02 Nov 2021
Speech Emotion Recognition Using Quaternion Convolutional Neural Networks Aneesh Muppidi Martin H. Radfar 25 46 0 31 Oct 2021
Combining Unsupervised and Text Augmented Semi-Supervised Learning for Low Resourced Autoregressive Speech Recognition Chak-Fai Li Francis Keith William Hartmann M. Snover SSL 21 2 0 29 Oct 2021
A Unified Speaker Adaptation Approach for ASR Yingzhu Zhao Chongjia Ni C. Leung Chenyu You Chng Eng Siong B. Ma CLL 92 9 0 16 Oct 2021
WenetSpeech: A 10000+ Hours Multi-domain Mandarin Corpus for Speech Recognition Binbin Zhang Hang Lv Pengcheng Guo Qijie Shao Chao Yang ... Hui Bu Xiaoyu Chen Chenchen Zeng Di Wu Zhendong Peng 25 219 0 07 Oct 2021
CTC Variations Through New WFST Topologies A. Laptev Somshubra Majumdar Boris Ginsburg 34 20 0 06 Oct 2021
Differentiable Allophone Graphs for Language-Universal Speech Recognition Brian Yan Siddharth Dalmia David R. Mortensen Florian Metze Shinji Watanabe 24 11 0 24 Jul 2021
End-to-End Rich Transcription-Style Automatic Speech Recognition with Semi-Supervised Learning Tomohiro Tanaka Ryo Masumura Mana Ihori Akihiko Takashima Shota Orihashi Naoki Makishima 19 4 0 07 Jul 2021
Cross-Modal Transformer-Based Neural Correction Models for Automatic Speech Recognition Tomohiro Tanaka Ryo Masumura Mana Ihori Akihiko Takashima Takafumi Moriya Takanori Ashihara Shota Orihashi Naoki Makishima 16 7 0 04 Jul 2021
What do End-to-End Speech Models Learn about Speaker, Language and Channel Information? A Layer-wise and Neuron-level Analysis Shammur A. Chowdhury Nadir Durrani Ahmed M. Ali 41 12 0 01 Jul 2021
Multi-mode Transformer Transducer with Stochastic Future Context Kwangyoun Kim Felix Wu Prashant Sridhar Kyu Jeong Han Shinji Watanabe 30 9 0 17 Jun 2021
Why does CTC result in peaky behavior? Albert Zeyer Ralf Schluter Hermann Ney 22 44 0 31 May 2021
On Addressing Practical Challenges for RNN-Transducer Rui Zhao Jian Xue Jinyu Li Wenning Wei Lei He Jiawei Liu 25 30 0 27 Apr 2021
WNARS: WFST based Non-autoregressive Streaming End-to-End Speech Recognition Zhichao Wang Wenwen Yang Pan Zhou Wei Chen RALM 34 17 0 08 Apr 2021
Dynamic Encoder Transducer: A Flexible Solution For Trading Off Accuracy For Latency Yangyang Shi Varun K. Nagaraja Chunyang Wu Jay Mahadeokar Duc Le ... Ching-Feng Yeh Julian Chan Christian Fuegen Ozlem Kalinli M. Seltzer 27 15 0 05 Apr 2021
AgentFormer: Agent-Aware Transformers for Socio-Temporal Multi-Agent Forecasting Ye Yuan Xinshuo Weng Yanglan Ou Kris Kitani AI4TS 45 442 0 25 Mar 2021
Fast End-to-End Speech Recognition via Non-Autoregressive Models and Cross-Modal Knowledge Transferring from BERT Ye Bai Jiangyan Yi J. Tao Zhengkun Tian Zhengqi Wen Shuai Zhang RALM 33 51 0 15 Feb 2021
Intermediate Loss Regularization for CTC-based Speech Recognition Jaesong Lee Shinji Watanabe 118 135 0 05 Feb 2021
Fine-tuning Handwriting Recognition systems with Temporal Dropout Edgard Chammas C. Mokbel 13 3 0 31 Jan 2021
Arabic aspect based sentiment analysis using bidirectional GRU based models Mohammed Mustafa T. H. Soliman A. Taloba Mohammed Fawzi Seedik 15 76 0 23 Jan 2021
Tiny Transducer: A Highly-efficient Speech Recognition Model on Edge Devices Yuekai Zhang Sining Sun Long Ma 35 28 0 18 Jan 2021
DeCoAR 2.0: Deep Contextualized Acoustic Representations with Vector Quantization Shaoshi Ling Yuzong Liu 26 106 0 11 Dec 2020
End to End ASR System with Automatic Punctuation Insertion Yushi Guan 3DV 27 5 0 03 Dec 2020
Disentangling Homophemes in Lip Reading using Perplexity Analysis Souheil Fenghour Daqing Chen Kun Guo Perry Xiao 31 3 0 28 Nov 2020
Streaming end-to-end multi-talker speech recognition Liang Lu Naoyuki Kanda Jinyu Li Jiawei Liu 13 41 0 26 Nov 2020
STEPs-RL: Speech-Text Entanglement for Phonetically Sound Representation Learning Prakamya Mishra 18 0 0 23 Nov 2020
WaDeNet: Wavelet Decomposition based CNN for Speech Processing P. Suresh Abhijith Ragav 21 0 0 11 Nov 2020
Non-Autoregressive Predictive Coding for Learning Speech Representations from Local Dependencies Alexander H. Liu Yu-An Chung James R. Glass SSL 27 87 0 01 Nov 2020
Reduce and Reconstruct: ASR for Low-Resource Phonetic Languages Anuj Diwan P. Jyothi 11 5 0 19 Oct 2020
E-BATCH: Energy-Efficient and High-Throughput RNN Batching Franyell Silfa J. Arnau Antonio González 22 11 0 22 Sep 2020
Orientation-aware Vehicle Re-identification with Semantics-guided Part Attention Network Tsai-Shien Chen Chih-Ting Liu Chih-Wei Wu Shao-Yi Chien 3DPC 172 85 0 26 Aug 2020
Adaptation Algorithms for Neural Network-Based Speech Recognition: An Overview P. Bell Joachim Fainberg Ondřej Klejch Jinyu Li Steve Renals P. Swietojanski 46 74 0 14 Aug 2020
Modular End-to-end Automatic Speech Recognition Framework for Acoustic-to-word Model Qi Liu Zhehuai Chen Hao Li Mingkun Huang Yizhou Lu Kai Yu 21 6 0 31 Jul 2020
Developing RNN-T Models Surpassing High-Performance Hybrid Models with Customization Capability Jinyu Li Rui Zhao Zhong Meng Yanqing Liu Wenning Wei ... V. Mazalov Zhenghao Wang Lei He Sheng Zhao Jiawei Liu 18 107 0 30 Jul 2020
Fully Convolutional Networks for Continuous Sign Language Recognition Ka Leong Cheng Zhaoyang Yang Qifeng Chen Yu-Wing Tai SLR 44 143 0 24 Jul 2020
Hardware Acceleration of Sparse and Irregular Tensor Computations of ML Models: A Survey and Insights Shail Dave Riyadh Baghdadi Tony Nowatzki Sasikanth Avancha Aviral Shrivastava Baoxin Li 59 82 0 02 Jul 2020
Streaming Transformer ASR with Blockwise Synchronous Beam Search E. Tsunoo Yosuke Kashiwagi Shinji Watanabe 22 11 0 25 Jun 2020
A Heuristically Self-Organised Linguistic Attribute Deep Learning in Edge Computing For IoT Intelligence Hongmei He Zhenhuan Zhu 9 1 0 08 Jun 2020
On the Comparison of Popular End-to-End Models for Large Scale Speech Recognition Jinyu Li Yu-Huan Wu Yashesh Gaur Chengyi Wang Rui Zhao Shujie Liu 17 133 0 28 May 2020
CAT: A CTC-CRF based ASR Toolkit Bridging the Hybrid and the End-to-end Approaches towards Data Efficiency and Low Latency Keyu An Hongyu Xiang Zhijian Ou 14 18 0 27 May 2020
A systematic comparison of grapheme-based vs. phoneme-based label units for encoder-decoder-attention models Mohammad Zeineldeen Albert Zeyer Wei Zhou T. Ng Ralf Schluter Hermann Ney 22 2 0 19 May 2020