Deep Speech 2: End-to-End Speech Recognition in English and Mandarin

8 December 2015
Dario Amodei
Rishita Anubhai
Eric Battenberg
Carl Case
Jared Casper
Bryan Catanzaro
Jingdong Chen
Mike Chrzanowski
Adam Coates
G. Diamos
Erich Elsen
Jesse Engel
Linxi Fan
Christopher Fougner
T. Han
Awni Y. Hannun
Billy Jun
P. LeGresley
Libby Lin
Sharan Narang
A. Ng
Sherjil Ozair
R. Prenger
Jonathan Raiman
S. Satheesh
David Seetapun
Shubho Sengupta
Yi Wang
Zhiqian Wang
Chong-Jun Wang
Bo Xiao
Dani Yogatama
J. Zhan
Zhenyao Zhu
arXiv:1512.02595

Papers citing "Deep Speech 2: End-to-End Speech Recognition in English and Mandarin"

50 / 1,096 papers shown
Non-Autoregressive Machine Translation: It's Not as Fast as it Seems
North American Chapter of the Association for Computational Linguistics (NAACL), 2022
Jindřich Helcl
Barry Haddow
Alexandra Birch
99
20
0
04 May 2022
SMLT: A Serverless Framework for Scalable and Adaptive Machine Learning Design and Training
Ahsan Ali
Syed Zawad
Paarijaat Aditya
Istemi Ekin Akkus
Ruichuan Chen
Feng Yan
150
11
0
04 May 2022
Wav2Seq: Pre-training Speech-to-Text Encoder-Decoder Models Using Pseudo Languages
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Felix Wu
Kwangyoun Kim
Shinji Watanabe
Kyu Jeong Han
Ryan T. McDonald
Kilian Q. Weinberger
Yoav Artzi
SyDa
187
44
0
02 May 2022
DDDM: a Brain-Inspired Framework for Robust Classification
International Joint Conference on Artificial Intelligence (IJCAI), 2022
Xiyuan Chen
Xingyu Li
Yi Zhou
Tianming Yang
AAML, DiffM
113
9
0
01 May 2022
Named Entity Recognition for Audio De-Identification
IEEE International Joint Conference on Neural Network (IJCNN), 2022
Guillaume Baril
P. Cardinal
Alessandro Lameiras Koerich
95
5
0
26 Apr 2022
CellDefectNet: A Machine-designed Attention Condenser Network for Electroluminescence-based Photovoltaic Cell Defect Inspection
Canadian Conference on Computer and Robot Vision (CRV), 2022
Carol Xu
M. Famouri
Gautam Bathla
Saeejith Nair
M. Shafiee
Alexander Wong
183
5
0
25 Apr 2022
Distributionally Robust Models with Parametric Likelihood Ratios
International Conference on Learning Representations (ICLR), 2022
Paul Michel
Tatsunori Hashimoto
Graham Neubig
OOD
171
20
0
13 Apr 2022
A Wav2vec2-Based Experimental Study on Self-Supervised Learning Methods to Improve Child Speech Recognition
IEEE Access (IEEE Access), 2022
Rishabh Jain
Andrei Barcovschi
Mariam Yiwere
Dan Bigioi
Peter Corcoran
H. Cucu
166
49
0
06 Apr 2022
Successes and critical failures of neural networks in capturing human-like speech recognition
Neural Networks (NN), 2022
Federico Adolfi
J. Bowers
David Poeppel
UQCV
243
26
0
06 Apr 2022
Towards End-to-end Unsupervised Speech Recognition
Spoken Language Technology Workshop (SLT), 2022
Alexander H. Liu
Wei-Ning Hsu
Michael Auli
Alexei Baevski
SSL
208
84
0
05 Apr 2022
Lip to Speech Synthesis with Visual Context Attentional GAN
Neural Information Processing Systems (NeurIPS), 2022
Minsu Kim
Joanna Hong
Y. Ro
186
63
0
04 Apr 2022
Class-Incremental Learning by Knowledge Distillation with Adaptive Feature Consolidation
Computer Vision and Pattern Recognition (CVPR), 2022
Minsoo Kang
Jaeyoo Park
Bohyung Han
CLL
199
226
0
02 Apr 2022
Text-To-Speech Data Augmentation for Low Resource Speech Recognition
Rodolfo Zevallos
122
7
0
01 Apr 2022
Better Intermediates Improve CTC Inference
Interspeech (Interspeech), 2022
Tatsuya Komatsu
Yusuke Fujita
Jaesong Lee
Lukas Lee
Shinji Watanabe
Yusuke Kida
113
6
0
01 Apr 2022
PADA: Pruning Assisted Domain Adaptation for Self-Supervised Speech Representations
Spoken Language Technology Workshop (SLT), 2022
L. D. Prasad
Sreyan Ghosh
S. Umesh
256
14
0
31 Mar 2022
Memory-Efficient Training of RNN-Transducer with Sampled Softmax
Interspeech (Interspeech), 2022
Jaesong Lee
Lukas Lee
Shinji Watanabe
262
8
0
31 Mar 2022
Recent improvements of ASR models in the face of adversarial attacks
Interspeech (Interspeech), 2022
R. Olivier
Bhiksha Raj
AAML
193
17
0
29 Mar 2022
Streaming parallel transducer beam search with fast-slow cascaded encoders
Interspeech (Interspeech), 2022
Jay Mahadeokar
Yangyang Shi
Ke Li
Duc Le
Jiedan Zhu
Vikas Chandra
Ozlem Kalinli
M. Seltzer
164
17
0
29 Mar 2022
WeNet 2.0: More Productive End-to-End Speech Recognition Toolkit
Interspeech (Interspeech), 2022
Binbin Zhang
Di Wu
Zhendong Peng
Xingcheng Song
Zhuoyuan Yao
Hang Lv
Linfu Xie
Chao Yang
Fuping Pan
Jianwei Niu
VLM
240
126
0
29 Mar 2022
Mel Frequency Spectral Domain Defenses against Adversarial Attacks on Speech Recognition Systems
JASA Express Letters (JE), 2022
Nicholas Mehlman
Anirudh Sreeram
Raghuveer Peri
Shrikanth Narayanan
AAML
231
5
0
29 Mar 2022
Training speaker recognition systems with limited data
Interspeech (Interspeech), 2022
Nik Vaessen
David A. van Leeuwen
188
6
0
28 Mar 2022
Impact of Dataset on Acoustic Models for Automatic Speech Recognition
S. Singh
30
0
0
25 Mar 2022
Neural Predictor for Black-Box Adversarial Attacks on Speech Recognition
Marie Biolková
Bac Nguyen
AAML
133
2
0
18 Mar 2022
Self-Normalized Density Map (SNDM) for Counting Microbiological Objects
Scientific Reports (Sci Rep), 2022
K. Graczyk
J. Pawlowski
Sylwia Majchrowska
Tomasz Golan
189
12
0
15 Mar 2022
DialogueNeRF: Towards Realistic Avatar Face-to-Face Conversation Video Generation
Manwen Liao
Zanwei Zhou
Zi Wang
Chen-Ning Yang
Xiaokang Yang
CVBM
175
35
0
15 Mar 2022
Magnetic Field Prediction Using Generative Adversarial Networks
Journal of Magnetism and Magnetic Materials (JMMM), 2022
Stefan Pollok
Nataniel Olden-Jorgensen
P. S. Jørgensen
Rasmus Bjørk
GAN, AI4CE
120
22
0
14 Mar 2022
Attacks as Defenses: Designing Robust Audio CAPTCHAs Using Attacks on Automatic Speech Recognition Systems
Network and Distributed System Security Symposium (NDSS), 2022
H. Abdullah
Aditya Karlekar
S. Prasad
Muhammad Sajidur Rahman
Logan Blue
L. A. Bauer
Vincent Bindschaedler
Patrick Traynor
AAML
123
4
0
10 Mar 2022
Shfl-BW: Accelerating Deep Neural Network Inference with Tensor-Core Aware Weight Pruning
Design Automation Conference (DAC), 2022
Guyue Huang
Haoran Li
Minghai Qin
Fei Sun
Yufei Ding
Yuan Xie
150
21
0
09 Mar 2022
Towards Generalized Models for Task-oriented Dialogue Modeling on Spoken Conversations
Ruijie Yan
Shuang Peng
Haitao Mi
Liang Jiang
Shihui Yang
Yuchi Zhang
Jiajun Li
Liangrui Peng
Yongliang Wang
Zujie Wen
93
4
0
08 Mar 2022
GaitEdge: Beyond Plain End-to-end Gait Recognition for Better Practicality
European Conference on Computer Vision (ECCV), 2022
Junhao Liang
Chao Fan
Saihui Hou
Chuanfu Shen
Yongzhen Huang
Shiqi Yu
CVBM
171
96
0
08 Mar 2022
aaeCAPTCHA: The Design and Implementation of Audio Adversarial CAPTCHA
European Symposium on Security and Privacy (Euro S&P), 2022
Md. Imran Hossen
X. Hei
114
9
0
05 Mar 2022
A Conformer Based Acoustic Model for Robust Automatic Speech Recognition
Yufeng Yang
Peidong Wang
DeLiang Wang
249
13
0
01 Mar 2022
A Survey of Multilingual Models for Automatic Speech Recognition
International Conference on Language Resources and Evaluation (LREC), 2022
Hemant Yadav
Sunayana Sitaram
135
47
0
25 Feb 2022
Differentially Private Speaker Anonymization
Proceedings on Privacy Enhancing Technologies (PoPETs), 2022
Ali Shahin Shamsabadi
B. M. L. Srivastava
A. Bellet
Nathalie Vauquier
Emmanuel Vincent
Mohamed Maouche
Marc Tommasi
Nicolas Papernot
MIACV
298
48
0
23 Feb 2022
Memory Planning for Deep Neural Networks
Maksim Levental
162
4
0
23 Feb 2022
Korean Tokenization for Beam Search Rescoring in Speech Recognition
International Conference on Electronics, Information and Communications (ICEIC), 2022
Kyuhong Shim
Hyewon Bae
Wonyong Sung
140
0
0
22 Feb 2022
HRel: Filter Pruning based on High Relevance between Activation Maps and Class Labels
Neural Networks (NN), 2021
C. Sarvani
Mrinmoy Ghorai
S. Dubey
S. H. Shabbeer Basha
VLM
266
45
0
22 Feb 2022
Adversarial Attacks on Speech Recognition Systems for Mission-Critical Applications: A Survey
Ngoc Dung Huynh
Mohamed Reda Bouadjenek
Imran Razzak
Kevin Lee
Chetan Arora
Ali Hassani
A. Zaslavsky
AAML
165
8
0
22 Feb 2022
Spanish and English Phoneme Recognition by Training on Simulated Classroom Audio Recordings of Collaborative Learning Environments
Mario Esparza
151
0
0
21 Feb 2022
Where Is My Training Bottleneck? Hidden Trade-Offs in Deep Learning Preprocessing Pipelines
Alexander Isenko
R. Mayer
Jeffrey Jedele
Hans-Arno Jacobsen
300
28
0
17 Feb 2022
Multi-style Training for South African Call Centre Audio
Walter Heymans
Marelie Hattingh Davel
C. van Heerden
67
3
0
15 Feb 2022
DeepONet-Grid-UQ: A Trustworthy Deep Operator Framework for Predicting the Power Grid's Post-Fault Trajectories
Neurocomputing (Neurocomputing), 2022
Christian Moya
Shiqi Zhang
Meng Yue
Guang Lin
163
56
0
15 Feb 2022
Saving RNN Computations with a Neuron-Level Fuzzy Memoization Scheme
Franyell Silfa
J. Arnau
Antonio González
86
1
0
14 Feb 2022
Compute Trends Across Three Eras of Machine Learning
IEEE International Joint Conference on Neural Network (IJCNN), 2022
J. Sevilla
Lennart Heim
A. Ho
T. Besiroglu
Marius Hobbhahn
Pablo Villalobos
485
352
0
11 Feb 2022
FAAG: Fast Adversarial Audio Generation through Interactive Attack Optimisation
Yuantian Miao
Chao Chen
Lei Pan
Jun Zhang
Yang Xiang
AAML
170
4
0
11 Feb 2022
ASRPU: A Programmable Accelerator for Low-Power Automatic Speech Recognition
Social Science Research Network (SSRN), 2022
D. Pinto
J. Arnau
Antonio González
64
0
0
10 Feb 2022
Conversational Agents: Theory and Applications
M. Wahde
M. Virgolin
LLMAG
171
30
0
07 Feb 2022
Towards Training Reproducible Deep Learning Models
International Conference on Software Engineering (ICSE), 2022
Boyuan Chen
Mingzhi Wen
Yong Shi
Dayi Lin
Gopi Krishnan Rajbahadur
Zhen Ming Jiang
SyDa
134
45
0
04 Feb 2022
Polyphonic pitch detection with convolutional recurrent neural networks
Carl Thomé
Sven Ahlbäck
148
8
0
04 Feb 2022
Learning strides in convolutional neural networks
International Conference on Learning Representations (ICLR), 2022
Rachid Riad
O. Teboul
David Grangier
Neil Zeghidour
148
49
0
03 Feb 2022