Advances in Joint CTC-Attention based End-to-End Speech Recognition with a Deep CNN Encoder and RNN-LM

Interspeech (Interspeech), 2017
8 June 2017
Takaaki Hori
Shinji Watanabe
Yu Zhang
William Chan

Papers citing "Advances in Joint CTC-Attention based End-to-End Speech Recognition with a Deep CNN Encoder and RNN-LM"

50 / 124 papers shown
Adversarial Meta Sampling for Multilingual Low-Resource Speech Recognition
AAAI Conference on Artificial Intelligence (AAAI), 2020
Yubei Xiao
Ke Gong
Pan Zhou
Guolin Zheng
Xiaodan Liang
Liang Lin
160
35
0
22 Dec 2020
End to End ASR System with Automatic Punctuation Insertion
Yushi Guan
3DV
90
6
0
03 Dec 2020
Stochastic Attention Head Removal: A simple and effective method for improving Transformer Based ASR Models
Shucong Zhang
Erfan Loweimi
P. Bell
Steve Renals
190
0
0
08 Nov 2020
Unsupervised Pattern Discovery from Thematic Speech Archives Based on Multilingual Bottleneck Features
Man-Ling Sung
Siyuan Feng
Tan Lee
86
4
0
03 Nov 2020
Semi-Supervised Speech Recognition via Graph-based Temporal Classification
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
Niko Moritz
Takaaki Hori
Jonathan Le Roux
245
30
0
29 Oct 2020
KoSpeech: Open-Source Toolkit for End-to-End Korean Speech Recognition
Soohwan Kim
Seyoung Bae
Cheolhwang Won
VLM
155
4
0
07 Sep 2020
Limited-angle tomographic reconstruction of dense layered objects by dynamical machine learning
Iksung Kang
A. Goy
George Barbastathis
AI4CE
99
23
0
21 Jul 2020
The ASRU 2019 Mandarin-English Code-Switching Speech Recognition Challenge: Open Datasets, Tracks, Methods and Results
Xian Shi
Qiangze Feng
Lei Xie
148
53
0
12 Jul 2020
CAT: A CTC-CRF based ASR Toolkit Bridging the Hybrid and the End-to-end Approaches towards Data Efficiency and Low Latency
Interspeech (Interspeech), 2020
Keyu An
Hongyu Xiang
Zhijian Ou
196
24
0
27 May 2020
A New Training Pipeline for an Improved Neural Transducer
Albert Zeyer
André Merboldt
Ralf Schlüter
Hermann Ney
AI4TS, MedIm
195
53
0
19 May 2020
End-to-end Whispered Speech Recognition with Frequency-weighted Approaches and Pseudo Whisper Pre-training
Heng-Jui Chang
Alexander H. Liu
Hung-yi Lee
Lin-Shan Lee
137
2
0
05 May 2020
Fast and Accurate Deep Bidirectional Language Representations for Unsupervised Learning
Annual Meeting of the Association for Computational Linguistics (ACL), 2020
Joongbo Shin
Yoonhyung Lee
Seunghyun Yoon
Kyomin Jung
OOD
141
12
0
17 Apr 2020
Learning Fast Adaptation on Cross-Accented Speech Recognition
Interspeech (Interspeech), 2020
Genta Indra Winata
Samuel Cahyawijaya
Zihan Liu
Mohammad Kachuee
Andrea Madotto
Peng Xu
Pascale Fung
135
86
0
04 Mar 2020
End-to-End Automatic Speech Recognition Integrated With CTC-Based Voice Activity Detection
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
Takenori Yoshimura
Tomoki Hayashi
K. Takeda
Shinji Watanabe
182
55
0
03 Feb 2020
Streaming automatic speech recognition with the transformer model
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
Niko Moritz
Takaaki Hori
Jonathan Le Roux
354
198
0
08 Jan 2020
Improved Multi-Stage Training of Online Attention-based Encoder-Decoder Models
Automatic Speech Recognition & Understanding (ASRU), 2019
Abhinav Garg
Dhananjaya N. Gowda
Ankur Kumar
Kwangyoun Kim
Mehul Kumar
Chanwoo Kim
3DV
92
15
0
28 Dec 2019
Application of Word2vec in Phoneme Recognition
International Conference on Machine Learning and Computing (ICMLC), 2019
Xin Feng
Lei Wang
101
3
0
17 Dec 2019
Independent language modeling architecture for end-to-end ASR
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019
Van Tung Pham
Haihua Xu
Yerbolat Khassanov
Zhiping Zeng
Chng Eng Siong
Chongjia Ni
B. Ma
Haizhou Li
AuLLM
128
16
0
25 Nov 2019
On Using SpecAugment for End-to-End Speech Translation
International Workshop on Spoken Language Translation (IWSLT), 2019
Parnia Bahar
Albert Zeyer
Ralf Schlüter
Hermann Ney
186
56
0
20 Nov 2019
CAT: CRF-based ASR Toolkit
Keyu An
Hongyu Xiang
Zhijian Ou
88
7
0
20 Nov 2019
What does a network layer hear? Analyzing hidden representations of end-to-end ASR through speech synthesis
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019
Chung-Yi Li
Pei-Chieh Yuan
Hung-yi Lee
317
32
0
04 Nov 2019
Lightweight and Efficient End-to-End Speech Recognition Using Low-Rank Transformer
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019
Genta Indra Winata
Samuel Cahyawijaya
Mohammad Kachuee
Zihan Liu
Pascale Fung
201
86
0
30 Oct 2019
Sequence-to-sequence Automatic Speech Recognition with Word Embedding Regularization and Fused Decoding
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019
Alexander H. Liu
Tzu-Wei Sung
Shun-Po Chuang
Hung-yi Lee
Lin-Shan Lee
158
13
0
28 Oct 2019
Exploring Lexicon-Free Modeling Units for End-to-End Korean and Korean-English Code-Switching Speech Recognition
Interspeech (Interspeech), 2019
Jisung Wang
Jihwan Kim
Sangki Kim
Yeha Lee
93
5
0
25 Oct 2019
Recognizing long-form speech using streaming end-to-end models
Automatic Speech Recognition & Understanding (ASRU), 2019
A. Narayanan
Rohit Prabhavalkar
Chung-Cheng Chiu
David Rybach
Tara N. Sainath
Trevor Strohman
160
135
0
24 Oct 2019
A practical two-stage training strategy for multi-stream end-to-end speech recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019
Ruizhi Li
Gregory Sell
Xiaofei Wang
Shinji Watanabe
H. Hermansky
80
7
0
23 Oct 2019
End-to-End Speech Recognition: A review for the French Language
Florian Boyer
Jean-Luc Rouas
AI4TS
141
10
0
18 Oct 2019
A Comparative Study on Transformer vs RNN in Speech Applications
Automatic Speech Recognition & Understanding (ASRU), 2019
Shigeki Karita
Nanxin Chen
Tomoki Hayashi
Takaaki Hori
Hirofumi Inaguma
...
Ryuichi Yamamoto
Xiao-fei Wang
Shinji Watanabe
Takenori Yoshimura
Wangyou Zhang
252
778
0
13 Sep 2019
Deep learning networks for selection of persistent scatterer pixels in multi-temporal SAR interferometric processing
A. Tiwari
Avadh Bihari Narayan
O. Dikshit
86
1
0
04 Sep 2019
Exploiting Cross-Lingual Speaker and Phonetic Diversity for Unsupervised Subword Modeling
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2019
Siyuan Feng
Tan Lee
128
10
0
09 Aug 2019
Cross-Attention End-to-End ASR for Two-Party Conversations
Interspeech (Interspeech), 2019
Suyoun Kim
Siddharth Dalmia
Florian Metze
194
19
0
24 Jul 2019
End-to-End Speech Recognition with High-Frame-Rate Features Extraction
Cong-Thanh Do
3DV
48
0
0
03 Jul 2019
Gated Embeddings in End-to-End Speech Recognition for Conversational-Context Fusion
Annual Meeting of the Association for Computational Linguistics (ACL), 2019
Suyoun Kim
Siddharth Dalmia
Florian Metze
164
24
0
27 Jun 2019
Self Multi-Head Attention for Speaker Recognition
Interspeech (Interspeech), 2019
Miquel India
Pooyan Safari
Javier Hernando
125
115
0
24 Jun 2019
Multi-Stream End-to-End Speech Recognition
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2019
Ruizhi Li
Xiaofei Wang
Sri Harish Reddy Mallidi
Shinji Watanabe
Takaaki Hori
H. Hermansky
169
23
0
17 Jun 2019
Acoustic-to-Word Models with Conversational Context Information
North American Chapter of the Association for Computational Linguistics (NAACL), 2019
Suyoun Kim
Florian Metze
123
7
0
21 May 2019
Semi-supervised Sequence-to-sequence ASR using Unpaired Speech and Text
M. Baskar
Shinji Watanabe
Ramón Fernández Astudillo
Takaaki Hori
L. Burget
J. Černocký
183
40
0
30 Apr 2019
Performance Monitoring for End-to-End Speech Recognition
Ruizhi Li
Gregory Sell
H. Hermansky
75
2
0
09 Apr 2019
Constrained Output Embeddings for End-to-End Code-Switching Speech Recognition with Only Monolingual Data
Yerbolat Khassanov
Haihua Xu
Van Tung Pham
Zhiping Zeng
Chng Eng Siong
Chongjia Ni
B. Ma
151
20
0
08 Apr 2019
Massively Multilingual Adversarial Speech Recognition
Oliver Adams
Sanjeev Khudanpur
Shinji Watanabe
David Yarowsky
140
79
0
03 Apr 2019
Learning Shared Encoding Representation for End-to-End Speech Recognition Models
T. Nguyen
Sebastian Stüker
A. Waibel
138
2
0
31 Mar 2019
Self-Attention Networks for Connectionist Temporal Classification in Speech Recognition
Julian Salazar
Katrin Kirchhoff
Zhiheng Huang
AI4TS
239
123
0
22 Jan 2019
Image retrieval method based on CNN and dimension reduction
Zhihao Cao
Shaomin Mu
Yongyu Xu
Mengping Dong
67
8
0
13 Jan 2019
Advancing Acoustic-to-Word CTC Model with Attention and Mixed-Units
Amit Das
Jinyu Li
Guoli Ye
Rui Zhao
Jiawei Liu
132
26
0
31 Dec 2018
Stream attention-based multi-array end-to-end speech recognition
Xiaofei Wang
Ruizhi Li
Sri Harish Reddy Mallidi
Takaaki Hori
Shinji Watanabe
H. Hermansky
129
21
0
12 Nov 2018
Multi-encoder multi-resolution framework for end-to-end speech recognition
Ruizhi Li
Xiaofei Wang
Sri Harish Reddy Mallidi
Takaaki Hori
Shinji Watanabe
H. Hermansky
110
13
0
12 Nov 2018
Vectorization of hypotheses and speech for faster beam search in encoder decoder-based speech recognition
Hiroshi Seki
Takaaki Hori
Shinji Watanabe
106
2
0
12 Nov 2018
Few-shot learning with attention-based sequence-to-sequence models
Bertrand Higy
P. Bell
96
7
0
08 Nov 2018
CNN-based MultiChannel End-to-End Speech Recognition for everyday home environments
European Signal Processing Conference (EUSIPCO), 2018
Hyungjun Lim
Younggwan Kim
Takaaki Hori
Myunghun Jung
Hoirin Kim
144
12
0
07 Nov 2018
Language model integration based on memory control for sequence to sequence speech recognition
Aaron Springer
Shinji Watanabe
Takaaki Hori
M. Baskar
Hirofumi Inaguma
Jesus Villalba
Najim Dehak
KELM
163
6
0
06 Nov 2018