1512.02595
Deep Speech 2: End-to-End Speech Recognition in English and Mandarin
8 December 2015
Dario Amodei
Rishita Anubhai
Eric Battenberg
Carl Case
Jared Casper
Bryan Catanzaro
Jingdong Chen
Mike Chrzanowski
Adam Coates
G. Diamos
Erich Elsen
Jesse Engel
Linxi Fan
Christopher Fougner
T. Han
Awni Y. Hannun
Billy Jun
P. LeGresley
Libby Lin
Sharan Narang
A. Ng
Sherjil Ozair
R. Prenger
Jonathan Raiman
S. Satheesh
David Seetapun
Shubho Sengupta
Yi Wang
Zhiqian Wang
Chong-Jun Wang
Bo Xiao
Dani Yogatama
J. Zhan
Zhenyao Zhu
Papers citing
"Deep Speech 2: End-to-End Speech Recognition in English and Mandarin"
50 / 1,096 papers shown
Title
Joint Speech Recognition and Audio Captioning
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Chaitanya Narisetty
E. Tsunoo
Xuankai Chang
Yosuke Kashiwagi
Michael Hentschel
Shinji Watanabe
106
10
0
03 Feb 2022
Imperceptible and Multi-channel Backdoor Attack against Deep Neural Networks
Mingfu Xue
S. Ni
Ying-Chang Wu
Yushu Zhang
Jian Wang
Weiqiang Liu
AAML
180
18
0
31 Jan 2022
The Norwegian Parliamentary Speech Corpus
International Conference on Language Resources and Evaluation (LREC), 2022
Per Erik Solberg
Pablo Ortiz
98
15
0
26 Jan 2022
Internal Language Model Estimation Through Explicit Context Vector Learning for Attention-based Encoder-decoder ASR
Interspeech (Interspeech), 2022
Yufei Liu
Rao Ma
Haihua Xu
Yi He
Zejun Ma
Weibin Zhang
128
15
0
26 Jan 2022
Improved Mispronunciation detection system using a hybrid CTC-ATT based approach for L2 English speakers
Neha Baranwal
Sharatkumar Chilaka
92
3
0
25 Jan 2022
A Noise-Robust Self-supervised Pre-training Model Based Speech Representation Learning for Automatic Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Qiu-shi Zhu
Jie Zhang
Zi-qiang Zhang
Ming Wu
Xin Fang
Lirong Dai
281
51
0
22 Jan 2022
Semantic-Aware Implicit Neural Audio-Driven Video Portrait Generation
European Conference on Computer Vision (ECCV), 2022
Xian Liu
Yinghao Xu
Qianyi Wu
Hang Zhou
Wayne Wu
Bolei Zhou
VGen
DiffM
3DH
173
161
0
19 Jan 2022
Transferability in Deep Learning: A Survey
Junguang Jiang
Yang Shu
Jianmin Wang
Mingsheng Long
OOD
189
128
0
15 Jan 2022
Robust Self-Supervised Audio-Visual Speech Recognition
Interspeech (Interspeech), 2022
Bowen Shi
Wei-Ning Hsu
Abdel-rahman Mohamed
273
115
0
05 Jan 2022
Discrete and continuous representations and processing in deep learning: Looking forward
AI Open (AO), 2022
Ruben Cartuyvels
Graham Spinks
Marie-Francine Moens
OCL
293
28
0
04 Jan 2022
DFA-NeRF: Personalized Talking Head Generation via Disentangled Face Attributes Neural Rendering
Shunyu Yao
Ruizhe Zhong
Manwen Liao
Guangtao Zhai
Xiaokang Yang
CVBM
147
113
0
03 Jan 2022
Making AI 'Smart': Bridging AI and Cognitive Science
Madhav Agarwal
Siddhant Bansal
145
0
0
31 Dec 2021
Towards Relatable Explainable AI with the Perceptual Process
International Conference on Human Factors in Computing Systems (CHI), 2021
Wencan Zhang
Brian Y. Lim
AAML
XAI
248
70
0
28 Dec 2021
Multi-Dialect Arabic Speech Recognition
IEEE International Joint Conference on Neural Network (IJCNN), 2020
Abbas Raza Ali
77
19
0
25 Dec 2021
Multi-Variant Consistency based Self-supervised Learning for Robust Automatic Speech Recognition
Changfeng Gao
Gaofeng Cheng
Pengyuan Zhang
245
4
0
23 Dec 2021
A Comprehensive Analytical Survey on Unsupervised and Semi-Supervised Graph Representation Learning Methods
Md. Khaledur Rahman
A. Azad
AI4TS
107
3
0
20 Dec 2021
Saliency Grafting: Innocuous Attribution-Guided Mixup with Calibrated Label Mixing
Joonhyung Park
J. Yang
Jinwoo Shin
Sung Ju Hwang
Eunho Yang
163
26
0
16 Dec 2021
On the Use of External Data for Spoken Named Entity Recognition
Ankita Pasad
Felix Wu
Suwon Shon
Karen Livescu
Kyu Jeong Han
115
17
0
14 Dec 2021
Real-Time Neural Voice Camouflage
Mia Chiquier
Chengzhi Mao
Carl Vondrick
162
8
0
14 Dec 2021
Perceptual Loss with Recognition Model for Single-Channel Enhancement and Robust ASR
Peter William VanHarn Plantinga
Deblin Bagchi
Eric Fosler-Lussier
174
10
0
11 Dec 2021
Are E2E ASR models ready for an industrial usage?
Valentin Vielzeuf
G. Antipov
274
8
0
09 Dec 2021
FastSGD: A Fast Compressed SGD Framework for Distributed Machine Learning
Keyu Yang
Lu Chen
Zhihao Zeng
Yunjun Gao
150
9
0
08 Dec 2021
A Transferable Approach for Partitioning Machine Learning Models on Multi-Chip-Modules
Xinfeng Xie
Prakash Prabhu
Ulysse Beaugnon
P. Phothilimthana
Sudip Roy
Azalia Mirhoseini
E. Brevdo
James Laudon
Yanqi Zhou
118
6
0
07 Dec 2021
Training end-to-end speech-to-text models on mobile phones
S. Zitha
Raghavendra Rao Suresh
Pooja S B. Rao
T. V. Prabhakar
142
1
0
07 Dec 2021
On Large Batch Training and Sharp Minima: A Fokker-Planck Perspective
Xiaowu Dai
Yuhua Zhu
124
8
0
02 Dec 2021
Automated Speech Scoring System Under The Lens: Evaluating and interpreting the linguistic cues for language proficiency
P. Bamdev
Manraj Singh Grover
Yaman Kumar Singla
Payman Vafaee
Mika Hama
R. Shah
146
16
0
30 Nov 2021
Factorized Fourier Neural Operators
International Conference on Learning Representations (ICLR), 2021
Alasdair Tran
A. Mathews
Lexing Xie
Cheng Soon Ong
AI4CE
367
219
0
27 Nov 2021
Romanian Speech Recognition Experiments from the ROBIN Project
Andrei-Marius Avram
Vasile Păiș
Dan Tufiș
128
5
0
23 Nov 2021
Human-Machine Interaction Speech Corpus from the ROBIN project
International Conference on Speech Technology and Human-Computer Dialogue (ICSTHD), 2021
V. Pais
Radu Ion
Andrei-Marius Avram
Elena Irimia
V. Mititelu
Maria Mitrofan
120
7
0
22 Nov 2021
Denoised Internal Models: a Brain-Inspired Autoencoder against Adversarial Attacks
Machine Intelligence Research (MIR), 2021
Kaiyuan Liu
Xingyu Li
Yu-Rui Lai
Hong Xie
Hang Su
Jiacheng Wang
Chunxu Guo
J. Guan
Yi Zhou
AAML
234
4
0
21 Nov 2021
The People's Speech: A Large-Scale Diverse English Speech Recognition Dataset for Commercial Usage
Daniel Galvez
G. Diamos
Juan Ciro
Juan Felipe Cerón
Keith Achorn
Anjali Gopi
David Kanter
Maximilian Lam
Mark Mazumder
Vijay Janapa Reddi
211
122
0
17 Nov 2021
A Survey on Adversarial Attacks for Malware Analysis
IEEE Access (IEEE Access), 2021
Kshitiz Aryal
Maanak Gupta
Mahmoud Abdelsalam
AAML
258
64
0
16 Nov 2021
Recurrent Neural Networks for Learning Long-term Temporal Dependencies with Reanalysis of Time Scale Representation
International Conference on Big Knowledge (ICBK), 2021
Kentaro Ohno
Atsutoshi Kumagai
CLL
AI4CE
91
10
0
05 Nov 2021
Towards Learning to Speak and Hear Through Multi-Agent Communication over a Continuous Acoustic Channel
Kevin Eloff
Okko Räsänen
H. Engelbrecht
Arnu Pretorius
Herman Kamper
166
3
0
04 Nov 2021
Speech recognition for air traffic control via feature learning and end-to-end training
Peng Fan
Dongyue Guo
Yi Lin
Bo Yang
Jianwei Zhang
133
8
0
04 Nov 2021
RT-RCG: Neural Network and Accelerator Search Towards Effective and Real-time ECG Reconstruction from Intracardiac Electrograms
ACM Journal on Emerging Technologies in Computing Systems (JETC), 2021
Yongan Zhang
Anton Banta
Yonggan Fu
M. John
A. Post
M. Razavi
Joseph R. Cavallaro
B. Aazhang
Yingyan Lin
133
4
0
04 Nov 2021
Recent Advances in End-to-End Automatic Speech Recognition
APSIPA Transactions on Signal and Information Processing (TASIP), 2021
Jinyu Li
VLM
382
424
0
02 Nov 2021
EfficientWord-Net: An Open Source Hotword Detection Engine based on One-shot Learning
Journal of Information & Knowledge Management (JIKM), 2021
R. Chidhambararajan
Aman Rangaur
S. C. Sethuraman
129
6
0
31 Oct 2021
Beyond $L_p$ clipping: Equalization-based Psychoacoustic Attacks against ASRs
H. Abdullah
Muhammad Sajidur Rahman
Christian Peeters
Cassidy Gibson
Washington Garcia
Vincent Bindschaedler
T. Shrimpton
Patrick Traynor
AAML
85
12
0
25 Oct 2021
Asynchronous Decentralized Distributed Training of Acoustic Models
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2021
Xiaodong Cui
Wei Zhang
Abdullah Kayi
Mingrui Liu
Ulrich Finkler
Brian Kingsbury
G. Saon
David S. Kung
109
3
0
21 Oct 2021
Synthesizing Optimal Parallelism Placement and Reduction Strategies on Hierarchical Systems for Deep Learning
Conference on Machine Learning and Systems (MLSys), 2021
Ningning Xie
Tamara Norman
Dominik Grewe
Dimitrios Vytiniotis
193
19
0
20 Oct 2021
Chunked Autoregressive GAN for Conditional Waveform Synthesis
Max Morrison
Rithesh Kumar
Kundan Kumar
Prem Seetharaman
Aaron Courville
Yoshua Bengio
GAN
170
84
0
19 Oct 2021
Self-Supervised Representation Learning: Introduction, Advances and Challenges
Linus Ericsson
Henry Gouk
Chen Change Loy
Timothy M. Hospedales
SSL
OOD
AI4TS
198
338
0
18 Oct 2021
Intelligent Video Editing: Incorporating Modern Talking Face Generation Algorithms in a Video Editor
Anchit Gupta
Faizan Farooq Khan
Rudrabha Mukhopadhyay
Vinay P. Namboodiri
C. V. Jawahar
CVBM
177
6
0
16 Oct 2021
Multistage linguistic conditioning of convolutional layers for speech emotion recognition
Andreas Triantafyllopoulos
U. Reichel
Shuo Liu
Simon Huber
F. Eyben
Björn W. Schuller
182
17
0
13 Oct 2021
K-Wav2vec 2.0: Automatic Speech Recognition based on Joint Decoding of Graphemes and Syllables
Interspeech (Interspeech), 2021
Jounghee Kim
Pilsung Kang
VLM
108
7
0
11 Oct 2021
Boosting Fast Adversarial Training with Learnable Adversarial Initialization
IEEE Transactions on Image Processing (TIP), 2021
Yang Liu
Yong Zhang
Baoyuan Wu
Jue Wang
Xiaochun Cao
AAML
275
65
0
11 Oct 2021
SCaLa: Supervised Contrastive Learning for End-to-End Speech Recognition
Interspeech (Interspeech), 2021
Li Fu
Xiaoxiao Li
Runyu Wang
Lu Fan
Zhengchen Zhang
Meng Chen
Youzheng Wu
Xiaodong He
SSL
148
3
0
08 Oct 2021
Explaining the Attention Mechanism of End-to-End Speech Recognition Using Decision Trees
Yuanchao Wang
Wenjing Du
Chenghao Cai
Yanyan Xu
145
1
0
08 Oct 2021
Speeding up Deep Model Training by Sharing Weights and Then Unsharing
Shuo Yang
Le Hou
Xiaodan Song
Qiang Liu
Denny Zhou
257
9
0
08 Oct 2021