Deep Speech 2: End-to-End Speech Recognition in English and Mandarin

8 December 2015
Dario Amodei
Rishita Anubhai
Eric Battenberg
Carl Case
Jared Casper
Bryan Catanzaro
Jingdong Chen
Mike Chrzanowski
Adam Coates
G. Diamos
Erich Elsen
Jesse Engel
Linxi Fan
Christopher Fougner
T. Han
Awni Y. Hannun
Billy Jun
P. LeGresley
Libby Lin
Sharan Narang
A. Ng
Sherjil Ozair
R. Prenger
Jonathan Raiman
S. Satheesh
David Seetapun
Shubho Sengupta
Yi Wang
Zhiqian Wang
Chong-Jun Wang
Bo Xiao
Dani Yogatama
J. Zhan
Zhenyao Zhu
arXiv: 1512.02595

Papers citing "Deep Speech 2: End-to-End Speech Recognition in English and Mandarin"

50 / 1,096 papers shown
Experiments on Turkish ASR with Self-Supervised Speech Representation Learning
Ali Safaya
E. Erzin
184
1
0
13 Oct 2022
Deep learning model compression using network sensitivity and gradients
M. Sakthi
N. Yadla
Raj Pawate
150
2
0
11 Oct 2022
TAN Without a Burn: Scaling Laws of DP-SGD
International Conference on Machine Learning (ICML), 2022
Tom Sander
Pierre Stock
Alexandre Sablayrolles
FedML
270
53
0
07 Oct 2022
CCC-wav2vec 2.0: Clustering aided Cross Contrastive Self-supervised learning of speech representations
Spoken Language Technology Workshop (SLT), 2022
Vasista Sai Lodagala
Sreyan Ghosh
S. Umesh
SSL
313
24
0
05 Oct 2022
How deep convolutional neural networks lose spatial information with training
Umberto M. Tomasini
Leonardo Petrini
Francesco Cagnetta
Matthieu Wyart
195
14
0
04 Oct 2022
A Comparison of Transformer, Convolutional, and Recurrent Neural Networks on Phoneme Recognition
Kyuhong Shim
Wonyong Sung
157
2
0
01 Oct 2022
CRISP: Curriculum based Sequential Neural Decoders for Polar Code Family
International Conference on Machine Learning (ICML), 2022
Ashwin Hebbar
Viraj Nadkarni
Ashok Vardhan Makkuva
S. Bhat
Sewoong Oh
Pramod Viswanath
266
11
0
01 Oct 2022
A Survey on Physical Adversarial Attack in Computer Vision
Donghua Wang
Wen Yao
Tingsong Jiang
Guijian Tang
Xiaoqian Chen
AAML
458
47
0
28 Sep 2022
InFi: End-to-End Learning to Filter Input for Resource-Efficiency in Mobile-Centric Inference
IEEE Transactions on Mobile Computing (IEEE TMC), 2022
Mu Yuan
Lan Zhang
Fengxiang He
Xueting Tong
Miao-Hui Song
Zhengyuan Xu
Xiang-Yang Li
279
4
0
28 Sep 2022
NWPU-ASLP System for the VoicePrivacy 2022 Challenge
Jixun Yao
Qing Wang
Li Zhang
Pengcheng Guo
Yuhao Liang
Linfu Xie
PICV
133
19
0
24 Sep 2022
HAPI: A Large-scale Longitudinal Dataset of Commercial ML API Predictions
Neural Information Processing Systems (NeurIPS), 2022
Lingjiao Chen
Zhihua Jin
Sabri Eyuboglu
Christopher Ré
Matei A. Zaharia
James Zou
187
9
0
18 Sep 2022
Watch What You Pretrain For: Targeted, Transferable Adversarial Examples on Self-Supervised Speech Recognition models
R. Olivier
H. Abdullah
Bhiksha Raj
AAML
222
1
0
17 Sep 2022
Lip-to-Speech Synthesis for Arbitrary Speakers in the Wild
ACM Multimedia (ACM MM), 2022
Sindhu B. Hegde
Prajwal K R
Rudrabha Mukhopadhyay
Vinay P. Namboodiri
C. V. Jawahar
190
14
0
01 Sep 2022
StableFace: Analyzing and Improving Motion Stability for Talking Face Generation
IEEE Journal on Selected Topics in Signal Processing (IEEE JSTSP), 2022
Jun Ling
Xuejiao Tan
Liyang Chen
Runnan Li
Yuchao Zhang
Sheng Zhao
Liang Song
CVBM
152
17
0
29 Aug 2022
Not All GPUs Are Created Equal: Characterizing Variability in Large-Scale, Accelerator-Rich Systems
International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2022
Prasoon Sinha
Akhil Guliani
Rutwik Jain
Brandon Tran
Matthew D. Sinclair
Shivaram Venkataraman
161
29
0
23 Aug 2022
Low-Level Physiological Implications of End-to-End Learning of Speech Recognition
Interspeech (Interspeech), 2022
Louise Coppieters de Gibson
Philip N. Garner
126
2
0
22 Aug 2022
Resisting Adversarial Attacks in Deep Neural Networks using Diverse Decision Boundaries
Manaar Alam
Shubhajit Datta
Debdeep Mukhopadhyay
Arijit Mondal
P. Chakrabarti
AAML
127
5
0
18 Aug 2022
Comparison and Analysis of New Curriculum Criteria for End-to-End ASR
Interspeech (Interspeech), 2022
Georgios Karakasidis
Tamás Grósz
M. Kurimo
122
3
0
10 Aug 2022
OLLIE: Derivation-based Tensor Program Optimizer
Liyan Zheng
Haojie Wang
Jidong Zhai
Muyan Hu
Zixuan Ma
Tuowei Wang
Shizhi Tang
Lei Xie
Kezhao Huang
Zhihao Jia
128
3
0
02 Aug 2022
A 23 μW Keyword Spotting IC with Ring-Oscillator-Based Time-Domain Feature Extraction
IEEE Journal of Solid-State Circuits (JSSC), 2022
Kwantae Kim
Chang Gao
Rui Gracca
Ilya Kiselev
H. Yoo
T. Delbruck
Shih-Chii Liu
163
32
0
01 Aug 2022
Domain Specific Wav2vec 2.0 Fine-tuning For The SE&R 2022 Challenge
A. I. S. Ferreira
Gustavo dos Reis Oliveira
165
3
0
29 Jul 2022
Federated Selective Aggregation for Knowledge Amalgamation
Don Xie
Ruonan Yu
Gongfan Fang
Mingli Song
Zunlei Feng
Xinchao Wang
Li Sun
Weilong Dai
FedML
128
4
0
27 Jul 2022
MISO: Exploiting Multi-Instance GPU Capability on Multi-Tenant Systems for Machine Learning
ACM Symposium on Cloud Computing (SoCC), 2022
Baolin Li
Tirthak Patel
S. Samsi
V. Gadepally
Devesh Tiwari
171
88
0
23 Jul 2022
The Neural Race Reduction: Dynamics of Abstraction in Gated Networks
International Conference on Machine Learning (ICML), 2022
Andrew M. Saxe
Shagun Sodhani
Sam Lewallen
AI4CE
182
43
0
21 Jul 2022
End-to-End Spoken Language Understanding: Performance analyses of a voice command task in a low resource setting
Computer Speech and Language (CSL), 2022
Thierry Desot
François Portet
Michel Vacher
103
15
0
17 Jul 2022
Data Augmentation for Low-Resource Quechua ASR Improvement
Interspeech (Interspeech), 2022
Rodolfo Zevallos
Núria Bel
Guillermo Cámbara
Mireia Farrús
Jordi Luque
VLM, SyDa
87
10
0
14 Jul 2022
The ACII 2022 Affective Vocal Bursts Workshop & Competition: Understanding a critically understudied modality of emotional expression
Alice Baird
Panagiotis Tzirakis
Jeffrey A. Brooks
Christopher B. Gregory
Björn Schuller
A. Batliner
D. Keltner
Alan S. Cowen
103
17
0
07 Jul 2022
Rapid training of quantum recurrent neural networks
Quantum Machine Intelligence (QMI), 2022
M. Siemaszko
A. Buraczewski
Bertrand Le Saux
Magdalena Stobiñska
254
14
0
01 Jul 2022
Data-Efficient Learning via Minimizing Hyperspherical Energy
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Xiaofeng Cao
Weiyang Liu
Ivor W. Tsang
159
12
0
30 Jun 2022
An extensible Benchmarking Graph-Mesh dataset for studying Steady-State Incompressible Navier-Stokes Equations
F. Bonnet
Jocelyn Ahmed Mazari
T. Munzer
P. Yser
Patrick Gallinari
AI4CE
171
11
0
29 Jun 2022
Wav2Vec-Aug: Improved self-supervised training with limited data
Interspeech (Interspeech), 2022
Anuroop Sriram
Michael Auli
Alexei Baevski
SSL, VLM
165
16
0
27 Jun 2022
Self-Healing Robust Neural Networks via Closed-Loop Control
Journal of machine learning research (JMLR), 2022
Zhuotong Chen
Qianxiao Li
Zheng Zhang
AAML, OOD
109
11
0
26 Jun 2022
Transfer Learning for Robust Low-Resource Children's Speech ASR with Transformers and Source-Filter Warping
Interspeech (Interspeech), 2022
Jenthe Thienpondt
Kris Demuynck
192
13
0
19 Jun 2022
Residual Language Model for End-to-end Speech Recognition
Interspeech (Interspeech), 2022
E. Tsunoo
Yosuke Kashiwagi
Chaitanya Narisetty
Shinji Watanabe
127
11
0
15 Jun 2022
LADDER: Latent Boundary-guided Adversarial Training
Machine-mediated learning (ML), 2022
Xiaowei Zhou
Ivor W. Tsang
Jie Yin
AAML
127
9
0
08 Jun 2022
LegoNN: Building Modular Encoder-Decoder Models
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Siddharth Dalmia
Dmytro Okhonko
M. Lewis
Sergey Edunov
Shinji Watanabe
Florian Metze
Luke Zettlemoyer
Abdel-rahman Mohamed
AuLLM, MoE
135
16
0
07 Jun 2022
Predicting and Understanding Human Action Decisions during Skillful Joint-Action via Machine Learning and Explainable-AI
Fabrizia Auletta
Rachel W. Kallen
M. D. Bernardo
Micheal J. Richardson
87
2
0
06 Jun 2022
Lip-Listening: Mixing Senses to Understand Lips using Cross Modality Knowledge Distillation for Word-Based Models
Hadeel Mabrouk
Omar Abugabal
Nourhan Sakr
Hesham M. Eraqi
VLM
114
2
0
05 Jun 2022
Toward a realistic model of speech processing in the brain with self-supervised learning
Neural Information Processing Systems (NeurIPS), 2022
Juliette Millet
Charlotte Caucheteux
Pierre Orhan
Yves Boubenec
Alexandre Gramfort
Ewan Dunbar
Christophe Pallier
J. King
234
124
0
03 Jun 2022
Deep neural networks can stably solve high-dimensional, noisy, non-linear inverse problems
Analysis and Applications (Anal. Appl.), 2022
Andrés Felipe Lerma Pineda
P. Petersen
258
6
0
02 Jun 2022
FELARE: Fair Scheduling of Machine Learning Tasks on Heterogeneous Edge Systems
IEEE International Conference on Cloud Computing (CLOUD), 2022
Ali Mokhtari
Md. Abir Hossen
Pooyan Jamshidi
M. Salehi
202
11
0
31 May 2022
Do self-supervised speech models develop human-like perception biases?
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Juliette Millet
Ewan Dunbar
SSL
112
25
0
31 May 2022
Self-supervised models of audio effectively explain human cortical responses to speech
International Conference on Machine Learning (ICML), 2022
Aditya R. Vaidya
Shailee Jain
Alexander G. Huth
177
68
0
27 May 2022
Transfer and Share: Semi-Supervised Learning from Long-Tailed Data
Machine-mediated learning (ML), 2022
Tong Wei
Qianqian Liu
Jiang-Xin Shi
Wei-Wei Tu
Lan-Zhe Guo
135
16
0
26 May 2022
Joint Training of Speech Enhancement and Self-supervised Model for Noise-robust ASR
Qiu-shi Zhu
Jie Zhang
Zitian Zhang
Lirong Dai
174
18
0
26 May 2022
PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit
North American Chapter of the Association for Computational Linguistics (NAACL), 2022
Hui Zhang
Tian Yuan
Junkun Chen
Xintong Li
Renjie Zheng
...
Zeyu Chen
Xiaoguang Hu
Dianhai Yu
Yanjun Ma
Liang Huang
AuLLM
133
31
0
20 May 2022
Insights on Neural Representations for End-to-End Speech Recognition
Interspeech (Interspeech), 2021
A. Ollerenshaw
Md. Asif Jalal
Thomas Hain
107
10
0
19 May 2022
Learning Rate Curriculum
International Journal of Computer Vision (IJCV), 2022
Florinel-Alin Croitoru
Nicolae-Cătălin Ristea
Radu Tudor Ionescu
Andrii Zadaianchuk
188
23
0
18 May 2022
Deep Learning Enabled Semantic Communications with Speech Recognition and Synthesis
IEEE Transactions on Wireless Communications (TWC), 2022
Zhenzi Weng
Zhijin Qin
Xiaoming Tao
Chengkang Pan
Guangyi Liu
Geoffrey Ye Li
189
194
0
09 May 2022
A Highly Adaptive Acoustic Model for Accurate Multi-Dialect Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019
Sanghyun Yoo
Inchul Song
Yoshua Bengio
130
32
0
06 May 2022