Advances in Joint CTC-Attention based End-to-End Speech Recognition with a Deep CNN Encoder and RNN-LM

Interspeech (Interspeech), 2017

8 June 2017

Papers citing "Advances in Joint CTC-Attention based End-to-End Speech Recognition with a Deep CNN Encoder and RNN-LM"

Adversarial Meta Sampling for Multilingual Low-Resource Speech Recognition. AAAI Conference on Artificial Intelligence (AAAI), 2020 Yubei Xiao Ke Gong Pan Zhou Guolin Zheng Xiaodan Liang Liang Lin 160 35 0 22 Dec 2020
End to End ASR System with Automatic Punctuation Insertion Yushi Guan 3DV 90 6 0 03 Dec 2020
Stochastic Attention Head Removal: A simple and effective method for improving Transformer Based ASR Models Shucong Zhang Erfan Loweimi P. Bell Steve Renals 190 0 0 08 Nov 2020
Unsupervised Pattern Discovery from Thematic Speech Archives Based on Multilingual Bottleneck Features Man-Ling Sung Siyuan Feng Tan Lee 86 4 0 03 Nov 2020
Semi-Supervised Speech Recognition via Graph-based Temporal Classification. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020 Niko Moritz Takaaki Hori Jonathan Le Roux 245 30 0 29 Oct 2020
KoSpeech: Open-Source Toolkit for End-to-End Korean Speech Recognition Soohwan Kim Seyoung Bae Cheolhwang Won VLM 155 4 0 07 Sep 2020
Limited-angle tomographic reconstruction of dense layered objects by dynamical machine learning Iksung Kang A. Goy George Barbastathis AI4CE 99 23 0 21 Jul 2020
The ASRU 2019 Mandarin-English Code-Switching Speech Recognition Challenge: Open Datasets, Tracks, Methods and Results Xian Shi Qiangze Feng Lei Xie 148 53 0 12 Jul 2020
CAT: A CTC-CRF based ASR Toolkit Bridging the Hybrid and the End-to-end Approaches towards Data Efficiency and Low Latency. Interspeech (Interspeech), 2020 Keyu An Hongyu Xiang Zhijian Ou 196 24 0 27 May 2020
A New Training Pipeline for an Improved Neural Transducer Albert Zeyer André Merboldt Ralf Schlüter Hermann Ney AI4TS MedIm 195 53 0 19 May 2020
End-to-end Whispered Speech Recognition with Frequency-weighted Approaches and Pseudo Whisper Pre-training Heng-Jui Chang Alexander H. Liu Hung-yi Lee Lin-Shan Lee 137 2 0 05 May 2020
Fast and Accurate Deep Bidirectional Language Representations for Unsupervised Learning. Annual Meeting of the Association for Computational Linguistics (ACL), 2020 Joongbo Shin Yoonhyung Lee Seunghyun Yoon Kyomin Jung OOD 141 12 0 17 Apr 2020
Learning Fast Adaptation on Cross-Accented Speech Recognition. Interspeech (Interspeech), 2020 Genta Indra Winata Samuel Cahyawijaya Zihan Liu Mohammad Kachuee Andrea Madotto Peng Xu Pascale Fung 135 86 0 04 Mar 2020
End-to-End Automatic Speech Recognition Integrated With CTC-Based Voice Activity Detection. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020 Takenori Yoshimura Tomoki Hayashi K. Takeda Shinji Watanabe 182 55 0 03 Feb 2020
Streaming automatic speech recognition with the transformer model. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020 Niko Moritz Takaaki Hori Jonathan Le Roux 354 198 0 08 Jan 2020
Improved Multi-Stage Training of Online Attention-based Encoder-Decoder Models. Automatic Speech Recognition & Understanding (ASRU), 2019 Abhinav Garg Dhananjaya N. Gowda Ankur Kumar Kwangyoun Kim Mehul Kumar Chanwoo Kim 3DV 92 15 0 28 Dec 2019
Application of Word2vec in Phoneme Recognition. International Conference on Machine Learning and Computing (ICMLC), 2019 Xin Feng Lei Wang 101 3 0 17 Dec 2019
Independent language modeling architecture for end-to-end ASR. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019 Van Tung Pham Haihua Xu Yerbolat Khassanov Zhiping Zeng Chng Eng Siong Chongjia Ni B. Ma Haizhou Li AuLLM 128 16 0 25 Nov 2019
On Using SpecAugment for End-to-End Speech Translation. International Workshop on Spoken Language Translation (IWSLT), 2019 Parnia Bahar Albert Zeyer Ralf Schlüter Hermann Ney 186 56 0 20 Nov 2019
CAT: CRF-based ASR Toolkit Keyu An Hongyu Xiang Zhijian Ou 88 7 0 20 Nov 2019
What does a network layer hear? Analyzing hidden representations of end-to-end ASR through speech synthesis. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019 Chung-Yi Li Pei-Chieh Yuan Hung-yi Lee 317 32 0 04 Nov 2019
Lightweight and Efficient End-to-End Speech Recognition Using Low-Rank Transformer. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019 Genta Indra Winata Samuel Cahyawijaya Mohammad Kachuee Zihan Liu Pascale Fung 201 86 0 30 Oct 2019
Sequence-to-sequence Automatic Speech Recognition with Word Embedding Regularization and Fused Decoding. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019 Alexander H. Liu Tzu-Wei Sung Shun-Po Chuang Hung-yi Lee Lin-Shan Lee 158 13 0 28 Oct 2019
Exploring Lexicon-Free Modeling Units for End-to-End Korean and Korean-English Code-Switching Speech Recognition. Interspeech (Interspeech), 2019 Jisung Wang Jihwan Kim Sangki Kim Yeha Lee 93 5 0 25 Oct 2019
Recognizing long-form speech using streaming end-to-end models. Automatic Speech Recognition & Understanding (ASRU), 2019 A. Narayanan Rohit Prabhavalkar Chung-Cheng Chiu David Rybach Tara N. Sainath Trevor Strohman 160 135 0 24 Oct 2019
A practical two-stage training strategy for multi-stream end-to-end speech recognition. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019 Ruizhi Li Gregory Sell Xiaofei Wang Shinji Watanabe H. Hermansky 80 7 0 23 Oct 2019
End-to-End Speech Recognition: A review for the French Language Florian Boyer Jean-Luc Rouas AI4TS 141 10 0 18 Oct 2019
A Comparative Study on Transformer vs RNN in Speech Applications. Automatic Speech Recognition & Understanding (ASRU), 2019 Shigeki Karita Nanxin Chen Tomoki Hayashi Takaaki Hori Hirofumi Inaguma ... Ryuichi Yamamoto Xiao-fei Wang Shinji Watanabe Takenori Yoshimura Wangyou Zhang 252 778 0 13 Sep 2019
Deep learning networks for selection of persistent scatterer pixels in multi-temporal SAR interferometric processing A. Tiwari Avadh Bihari Narayan O. Dikshit 86 1 0 04 Sep 2019
Exploiting Cross-Lingual Speaker and Phonetic Diversity for Unsupervised Subword Modeling. IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2019 Siyuan Feng Tan Lee 128 10 0 09 Aug 2019
Cross-Attention End-to-End ASR for Two-Party Conversations. Interspeech (Interspeech), 2019 Suyoun Kim Siddharth Dalmia Florian Metze 194 19 0 24 Jul 2019
End-to-End Speech Recognition with High-Frame-Rate Features Extraction Cong-Thanh Do 3DV 48 0 0 03 Jul 2019
Gated Embeddings in End-to-End Speech Recognition for Conversational-Context Fusion. Annual Meeting of the Association for Computational Linguistics (ACL), 2019 Suyoun Kim Siddharth Dalmia Florian Metze 164 24 0 27 Jun 2019
Self Multi-Head Attention for Speaker Recognition. Interspeech (Interspeech), 2019 Miquel India Pooyan Safari Javier Hernando 125 115 0 24 Jun 2019
Multi-Stream End-to-End Speech Recognition. IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2019 Ruizhi Li Xiaofei Wang Sri Harish Reddy Mallidi Shinji Watanabe Takaaki Hori H. Hermansky 169 23 0 17 Jun 2019
Acoustic-to-Word Models with Conversational Context Information. North American Chapter of the Association for Computational Linguistics (NAACL), 2019 Suyoun Kim Florian Metze 123 7 0 21 May 2019
Semi-supervised Sequence-to-sequence ASR using Unpaired Speech and Text M. Baskar Shinji Watanabe Ramón Fernández Astudillo Takaaki Hori L. Burget J. Černocký 183 40 0 30 Apr 2019
Performance Monitoring for End-to-End Speech Recognition Ruizhi Li Gregory Sell H. Hermansky 75 2 0 09 Apr 2019
Constrained Output Embeddings for End-to-End Code-Switching Speech Recognition with Only Monolingual Data Yerbolat Khassanov Haihua Xu Van Tung Pham Zhiping Zeng Chng Eng Siong Chongjia Ni B. Ma 151 20 0 08 Apr 2019
Massively Multilingual Adversarial Speech Recognition Oliver Adams Sanjeev Khudanpur Shinji Watanabe David Yarowsky 140 79 0 03 Apr 2019
Learning Shared Encoding Representation for End-to-End Speech Recognition Models T. Nguyen Sebastian Stüker A. Waibel 138 2 0 31 Mar 2019
Self-Attention Networks for Connectionist Temporal Classification in Speech Recognition Julian Salazar Katrin Kirchhoff Zhiheng Huang AI4TS 239 123 0 22 Jan 2019
Image retrieval method based on CNN and dimension reduction Zhihao Cao Shaomin Mu Yongyu Xu Mengping Dong 67 8 0 13 Jan 2019
Advancing Acoustic-to-Word CTC Model with Attention and Mixed-Units Amit Das Jinyu Li Guoli Ye Rui Zhao Jiawei Liu 132 26 0 31 Dec 2018
Stream attention-based multi-array end-to-end speech recognition Xiaofei Wang Ruizhi Li Sri Harish Reddy Mallidi Takaaki Hori Shinji Watanabe H. Hermansky 129 21 0 12 Nov 2018
Multi-encoder multi-resolution framework for end-to-end speech recognition Ruizhi Li Xiaofei Wang Sri Harish Reddy Mallidi Takaaki Hori Shinji Watanabe H. Hermansky 110 13 0 12 Nov 2018
Vectorization of hypotheses and speech for faster beam search in encoder decoder-based speech recognition Hiroshi Seki Takaaki Hori Shinji Watanabe 106 2 0 12 Nov 2018
Few-shot learning with attention-based sequence-to-sequence models Bertrand Higy P. Bell 96 7 0 08 Nov 2018
CNN-based MultiChannel End-to-End Speech Recognition for everyday home environments. European Signal Processing Conference (EUSIPCO), 2018 Hyungjun Lim Younggwan Kim Takaaki Hori Myunghun Jung Hoirin Kim 144 12 0 07 Nov 2018
Language model integration based on memory control for sequence to sequence speech recognition Aaron Springer Shinji Watanabe Takaaki Hori M. Baskar Hirofumi Inaguma Jesus Villalba Najim Dehak KELM 163 6 0 06 Nov 2018