Deep Speech: Scaling up end-to-end speech recognition

17 December 2014
Awni Y. Hannun
Carl Case
Jared Casper
Bryan Catanzaro
G. Diamos
Erich Elsen
R. Prenger
S. Satheesh
Shubho Sengupta
Adam Coates
A. Ng
arXiv:1412.5567

Papers citing "Deep Speech: Scaling up end-to-end speech recognition"

50 / 768 papers shown
Real-Time Neural Voice Camouflage
Mia Chiquier
Chengzhi Mao
Carl Vondrick
195
8
0
14 Dec 2021
Detecting Audio Adversarial Examples with Logit Noising
N. Park
Sangwoo Ji
Jong Kim
AAML
184
5
0
13 Dec 2021
Finding Deviated Behaviors of the Compressed DNN Models for Image Classifications
ACM Transactions on Software Engineering and Methodology (TOSEM), 2021
Yongqiang Tian
Wuqi Zhang
Ming Wen
Shing-Chi Cheung
Chengnian Sun
Shiqing Ma
Yu Jiang
230
9
0
06 Dec 2021
Joint Audio-Text Model for Expressive Speech-Driven 3D Facial Animation
Proceedings of the ACM on Computer Graphics and Interactive Techniques (PACMCGIT), 2021
Yingruo Fan
Mohammad Kachuee
Jun Saito
Wenping Wang
Taku Komura
175
27
0
04 Dec 2021
Catch Me If You Can: Blackbox Adversarial Attacks on Automatic Speech Recognition using Frequency Masking
Xiao-lan Wu
A. Rajan
AAML
233
7
0
03 Dec 2021
Transformer-S2A: Robust and Efficient Speech-to-Animation
Liyang Chen
Zhiyong Wu
Jun Ling
Runnan Li
Xu Tan
Sheng Zhao
216
19
0
18 Nov 2021
A Survey on Adversarial Attacks for Malware Analysis
IEEE Access, 2021
Kshitiz Aryal
Maanak Gupta
Mahmoud Abdelsalam
AAML
302
65
0
16 Nov 2021
Neural Population Geometry Reveals the Role of Stochasticity in Robust Perception
Neural Information Processing Systems (NeurIPS), 2021
Joel Dapello
J. Feather
Hang Le
Tiago Marques
David D. Cox
Josh H. McDermott
J. DiCarlo
SueYeon Chung
AAML, OOD
132
26
0
12 Nov 2021
Recent Advances in End-to-End Automatic Speech Recognition
APSIPA Transactions on Signal and Information Processing (TASIP), 2021
Jinyu Li
VLM
431
427
0
02 Nov 2021
With a Little Help from my Temporal Context: Multimodal Egocentric Action Recognition
British Machine Vision Conference (BMVC), 2021
Evangelos Kazakos
Jaesung Huh
Arsha Nagrani
Andrew Zisserman
Dima Damen
EgoV
297
54
0
01 Nov 2021
Imitating Arbitrary Talking Style for Realistic Audio-Driven Talking Face Synthesis
ACM Multimedia (ACM MM), 2021
Haozhe Wu
Jia Jia
Haoyu Wang
Yishun Dou
Chao Duan
Qingshan Deng
CVBM
183
83
0
30 Oct 2021
TorchAudio: Building Blocks for Audio and Speech Processing
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Yao-Yuan Yang
Moto Hira
Zhaoheng Ni
Anjali Chourdia
Artyom Astafurov
...
Mehrzad Samadi
Shinji Watanabe
Soumith Chintala
Vincent Quenneville-Bélair
Yangyang Shi
170
190
0
28 Oct 2021
Beyond $L_p$ clipping: Equalization-based Psychoacoustic Attacks against ASRs
H. Abdullah
Muhammad Sajidur Rahman
Christian Peeters
Cassidy Gibson
Washington Garcia
Vincent Bindschaedler
T. Shrimpton
Patrick Traynor
AAML
99
12
0
25 Oct 2021
Deep Neural Networks on EEG Signals to Predict Auditory Attention Score Using Gramian Angular Difference Field
Mahak Kothari
Shreyansh Joshi
Adarsh Nandanwar
Aadetya Jaiswal
V. Baths
72
1
0
24 Oct 2021
Asynchronous Decentralized Distributed Training of Acoustic Models
IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), 2021
Xiaodong Cui
Wei Zhang
Abdullah Kayi
Mingrui Liu
Ulrich Finkler
Brian Kingsbury
G. Saon
David S. Kung
126
3
0
21 Oct 2021
Activation Landscapes as a Topological Summary of Neural Network Performance
Matthew Wheeler
Jose J. Bouza
Peter Bubenik
172
22
0
19 Oct 2021
Speech Pattern based Black-box Model Watermarking for Automatic Speech Recognition
Haozhe Chen
Weiming Zhang
Kunlin Liu
Kejiang Chen
Han Fang
Nenghai Yu
94
4
0
19 Oct 2021
Black-box Adversarial Attacks on Commercial Speech Platforms with Minimal Information
Baolin Zheng
Peipei Jiang
Qian Wang
Qi Li
Chao Shen
Cong Wang
Yunjie Ge
Qingyang Teng
Shenyi Zhang
AAML
146
87
0
19 Oct 2021
Intent Classification Using Pre-trained Language Agnostic Embeddings For Low Resource Languages
Hemant Yadav
Akshat Gupta
Sai Krishna Rallabandi
A. Black
R. Shah
93
1
0
18 Oct 2021
Towards Robust Waveform-Based Acoustic Models
Dino Oglic
Zoran Cvetkovic
Peter Sollich
Steve Renals
Bin Yu
OOD, AAML
206
4
0
16 Oct 2021
On Language Model Integration for RNN Transducer based Speech Recognition
Wei Zhou
Zuoyun Zheng
Ralf Schluter
Hermann Ney
265
27
0
13 Oct 2021
Synergy: Resource Sensitive DNN Scheduling in Multi-Tenant Clusters
Jayashree Mohan
Amar Phanishayee
Janardhan Kulkarni
Vijay Chidambaram
GNN
239
8
0
12 Oct 2021
Automated Testing of AI Models
Swagatam Haldar
Deepak Vijaykeerthy
Diptikalyan Saha
VLM
114
0
0
07 Oct 2021
Internal Language Model Adaptation with Text-Only Data for End-to-End Speech Recognition
Zhong Meng
Yashesh Gaur
Naoyuki Kanda
Jinyu Li
Xie Chen
Yu Wu
Yifan Gong
AuLLM
220
34
0
06 Oct 2021
Building a Noisy Audio Dataset to Evaluate Machine Learning Approaches for Automatic Speech Recognition Systems
J. C. Duarte
S. Colcher
54
4
0
04 Oct 2021
Anti-aliasing Deep Image Classifiers using Novel Depth Adaptive Blurring and Activation Function
Md Tahmid Hossain
S. Teng
Ferdous Sohel
Guojun Lu
165
18
0
03 Oct 2021
SpliceOut: A Simple and Efficient Audio Augmentation Method
Arjit Jain
Pranay Reddy Samala
Deepak Mittal
Preethi Jyothi
M. Singh
451
13
0
30 Sep 2021
Challenges and Opportunities of Speech Recognition for Bengali Language
M. F. Mridha
Abu Quwsar Ohi
Md. Abdul Hamid
M. Monowar
109
7
0
27 Sep 2021
DeepStroke: An Efficient Stroke Screening Framework for Emergency Rooms with Multimodal Adversarial Deep Learning
Tongan Cai
Haomiao Ni
Ming-Chieh Yu
Xiaolei Huang
K. Wong
John Volpi
Chao Guo
Stephen T. C. Wong
158
26
0
24 Sep 2021
KOHTD: Kazakh Offline Handwritten Text Dataset
Signal Processing: Image Communication (SPIC), 2021
N. Toiganbayeva
M. Kasem
Galymzhan Abdimanap
K. Bostanbekov
Abdelrahman Abdallah
Anel N. Alimova
D. Nurseitov
207
29
0
22 Sep 2021
Live Speech Portraits: Real-Time Photorealistic Talking-Head Animation
ACM Transactions on Graphics (TOG), 2021
Yuanxun Lu
Jinxiang Chai
Xun Cao
205
97
0
22 Sep 2021
Reliable Neural Networks for Regression Uncertainty Estimation
Tony Tohme
Kevin Vanslette
K. Youcef-Toumi
UQCV, BDL
224
16
0
16 Sep 2021
Unsupervised Domain Adaptation Schemes for Building ASR in Low-resource Languages
A. C. S.
Prathosh A P
A. G. Ramakrishnan
203
16
0
12 Sep 2021
Learning Visual-Audio Representations for Voice-Controlled Robots
IEEE International Conference on Robotics and Automation (ICRA), 2021
Peixin Chang
Shuijing Liu
D. L. McPherson
Katherine Driggs-Campbell
SSL
237
8
0
07 Sep 2021
SEC4SR: A Security Analysis Platform for Speaker Recognition
Guangke Chen
Zhe Zhao
Fu Song
Sen Chen
Lingling Fan
Yang Liu
AAML
169
13
0
04 Sep 2021
Efficient conformer: Progressive downsampling and grouped attention for automatic speech recognition
Automatic Speech Recognition & Understanding (ASRU), 2021
Maxime Burchi
Valentin Vielzeuf
185
101
0
31 Aug 2021
Adversarial Example Devastation and Detection on Speech Recognition System by Adding Random Noise
Journal of The Audio Engineering Society (JAES), 2021
Mingyu Dong
Diqun Yan
Yongkang Gong
Rangding Wang
AAML
196
2
0
31 Aug 2021
Investigating Vulnerabilities of Deep Neural Policies
Conference on Uncertainty in Artificial Intelligence (UAI), 2021
Ezgi Korkmaz
AAML
138
38
0
30 Aug 2021
Automatic Speech Recognition And Limited Vocabulary: A Survey
J. L. E. K. Fendji
D. Tala
B. Yenke
M. Atemkeng
263
3
0
23 Aug 2021
FACIAL: Synthesizing Dynamic Talking Face with Implicit Attribute Learning
Chenxu Zhang
Yifan Zhao
Yifei Huang
Ming Zeng
Saifeng Ni
M. Budagavi
Xiaohu Guo
CVBM
166
143
0
18 Aug 2021
Detecting OODs as datapoints with High Uncertainty
R. Kaur
Susmit Jha
Anirban Roy
Sangdon Park
O. Sokolsky
Insup Lee
AAML, UQCV
128
15
0
13 Aug 2021
SpecMix: A Mixed Sample Data Augmentation method for Training with Time-Frequency Domain Features
Interspeech, 2021
Gwantae Kim
D. Han
Hanseok Ko
138
59
0
06 Aug 2021
Dyn-ASR: Compact, Multilingual Speech Recognition via Spoken Language and Accent Identification
World Forum on Internet of Things (WF-IoT), 2021
Sangeeta Ghangam
Daniel Whitenack
Joshua Nemecek
94
4
0
04 Aug 2021
A Study of Multilingual End-to-End Speech Recognition for Kazakh, Russian, and English
Saida Mussakhojayeva
Yerbolat Khassanov
H. A. Varol
153
19
0
03 Aug 2021
The History of Speech Recognition to the Year 2030
Awni Y. Hannun
AI4TS
229
24
0
30 Jul 2021
CarneliNet: Neural Mixture Model for Automatic Speech Recognition
A. Kalinov
Somshubra Majumdar
Jagadeesh Balam
Boris Ginsburg
MoE
105
3
0
22 Jul 2021
Trustworthy AI: A Computational Perspective
Haochen Liu
Yiqi Wang
Wenqi Fan
Xiaorui Liu
Yaxin Li
Shaili Jain
Yunhao Liu
Anil K. Jain
Shucheng Zhou
FaML
412
258
0
12 Jul 2021
End-to-End Rich Transcription-Style Automatic Speech Recognition with Semi-Supervised Learning
Tomohiro Tanaka
Ryo Masumura
Mana Ihori
Akihiko Takashima
Shota Orihashi
Naoki Makishima
128
4
0
07 Jul 2021
A Survey on Data Augmentation for Text Classification
Markus Bayer
M. Kaufhold
Christian A. Reuter
456
426
0
07 Jul 2021
Egocentric Videoconferencing
Mohamed A. Elgharib
Mohit Mendiratta
Justus Thies
Matthias Nießner
Hans-Peter Seidel
A. Tewari
Vladislav Golyanik
Christian Theobalt
EgoV
132
17
0
07 Jul 2021