1412.5567
Cited By
v1
v2 (latest)
Deep Speech: Scaling up end-to-end speech recognition
17 December 2014
Awni Y. Hannun
Carl Case
Jared Casper
Bryan Catanzaro
G. Diamos
Erich Elsen
R. Prenger
S. Satheesh
Shubho Sengupta
Adam Coates
A. Ng
Papers citing
"Deep Speech: Scaling up end-to-end speech recognition"
50 / 768 papers shown
Real-Time Neural Voice Camouflage
Mia Chiquier
Chengzhi Mao
Carl Vondrick
195
8
0
14 Dec 2021
Detecting Audio Adversarial Examples with Logit Noising
N. Park
Sangwoo Ji
Jong Kim
AAML
184
5
0
13 Dec 2021
Finding Deviated Behaviors of the Compressed DNN Models for Image Classifications
ACM Transactions on Software Engineering and Methodology (TOSEM), 2021
Yongqiang Tian
Wuqi Zhang
Ming Wen
Shing-Chi Cheung
Chengnian Sun
Shiqing Ma
Yu Jiang
230
9
0
06 Dec 2021
Joint Audio-Text Model for Expressive Speech-Driven 3D Facial Animation
Proceedings of the ACM on Computer Graphics and Interactive Techniques (PACMCGIT), 2021
Yingruo Fan
Mohammad Kachuee
Jun Saito
Wenping Wang
Taku Komura
175
27
0
04 Dec 2021
Catch Me If You Can: Blackbox Adversarial Attacks on Automatic Speech Recognition using Frequency Masking
Xiao-lan Wu
A. Rajan
AAML
233
7
0
03 Dec 2021
Transformer-S2A: Robust and Efficient Speech-to-Animation
Liyang Chen
Zhiyong Wu
Jun Ling
Runnan Li
Xu Tan
Sheng Zhao
216
19
0
18 Nov 2021
A Survey on Adversarial Attacks for Malware Analysis
IEEE Access (IEEE Access), 2021
Kshitiz Aryal
Maanak Gupta
Mahmoud Abdelsalam
AAML
302
65
0
16 Nov 2021
Neural Population Geometry Reveals the Role of Stochasticity in Robust Perception
Neural Information Processing Systems (NeurIPS), 2021
Joel Dapello
J. Feather
Hang Le
Tiago Marques
David D. Cox
Josh H. McDermott
J. DiCarlo
SueYeon Chung
AAML
OOD
132
26
0
12 Nov 2021
Recent Advances in End-to-End Automatic Speech Recognition
APSIPA Transactions on Signal and Information Processing (TASIP), 2021
Jinyu Li
VLM
431
427
0
02 Nov 2021
With a Little Help from my Temporal Context: Multimodal Egocentric Action Recognition
British Machine Vision Conference (BMVC), 2021
Evangelos Kazakos
Jaesung Huh
Arsha Nagrani
Andrew Zisserman
Dima Damen
EgoV
297
54
0
01 Nov 2021
Imitating Arbitrary Talking Style for Realistic Audio-Driven Talking Face Synthesis
ACM Multimedia (ACM MM), 2021
Haozhe Wu
Jia Jia
Haoyu Wang
Yishun Dou
Chao Duan
Qingshan Deng
CVBM
183
83
0
30 Oct 2021
TorchAudio: Building Blocks for Audio and Speech Processing
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Yao-Yuan Yang
Moto Hira
Zhaoheng Ni
Anjali Chourdia
Artyom Astafurov
...
Mehrzad Samadi
Shinji Watanabe
Soumith Chintala
Vincent Quenneville-Bélair
Yangyang Shi
170
190
0
28 Oct 2021
Beyond L_p clipping: Equalization-based Psychoacoustic Attacks against ASRs
H. Abdullah
Muhammad Sajidur Rahman
Christian Peeters
Cassidy Gibson
Washington Garcia
Vincent Bindschaedler
T. Shrimpton
Patrick Traynor
AAML
99
12
0
25 Oct 2021
Deep Neural Networks on EEG Signals to Predict Auditory Attention Score Using Gramian Angular Difference Field
Mahak Kothari
Shreyansh Joshi
Adarsh Nandanwar
Aadetya Jaiswal
V. Baths
72
1
0
24 Oct 2021
Asynchronous Decentralized Distributed Training of Acoustic Models
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2021
Xiaodong Cui
Wei Zhang
Abdullah Kayi
Mingrui Liu
Ulrich Finkler
Brian Kingsbury
G. Saon
David S. Kung
126
3
0
21 Oct 2021
Activation Landscapes as a Topological Summary of Neural Network Performance
Matthew Wheeler
Jose J. Bouza
Peter Bubenik
172
22
0
19 Oct 2021
Speech Pattern based Black-box Model Watermarking for Automatic Speech Recognition
Haozhe Chen
Weiming Zhang
Kunlin Liu
Kejiang Chen
Han Fang
Nenghai Yu
94
4
0
19 Oct 2021
Black-box Adversarial Attacks on Commercial Speech Platforms with Minimal Information
Baolin Zheng
Peipei Jiang
Qian Wang
Qi Li
Chao Shen
Cong Wang
Yunjie Ge
Qingyang Teng
Shenyi Zhang
AAML
146
87
0
19 Oct 2021
Intent Classification Using Pre-trained Language Agnostic Embeddings For Low Resource Languages
Hemant Yadav
Akshat Gupta
Sai Krishna Rallabandi
A. Black
R. Shah
93
1
0
18 Oct 2021
Towards Robust Waveform-Based Acoustic Models
Dino Oglic
Zoran Cvetkovic
Peter Sollich
Steve Renals
Bin Yu
OOD
AAML
206
4
0
16 Oct 2021
On Language Model Integration for RNN Transducer based Speech Recognition
Wei Zhou
Zuoyun Zheng
Ralf Schluter
Hermann Ney
265
27
0
13 Oct 2021
Synergy: Resource Sensitive DNN Scheduling in Multi-Tenant Clusters
Jayashree Mohan
Amar Phanishayee
Janardhan Kulkarni
Vijay Chidambaram
GNN
239
8
0
12 Oct 2021
Automated Testing of AI Models
Swagatam Haldar
Deepak Vijaykeerthy
Diptikalyan Saha
VLM
114
0
0
07 Oct 2021
Internal Language Model Adaptation with Text-Only Data for End-to-End Speech Recognition
Zhong Meng
Yashesh Gaur
Naoyuki Kanda
Jinyu Li
Xie Chen
Yu Wu
Yifan Gong
AuLLM
220
34
0
06 Oct 2021
Building a Noisy Audio Dataset to Evaluate Machine Learning Approaches for Automatic Speech Recognition Systems
J. C. Duarte
S. Colcher
54
4
0
04 Oct 2021
Anti-aliasing Deep Image Classifiers using Novel Depth Adaptive Blurring and Activation Function
Md Tahmid Hossain
S. Teng
Ferdous Sohel
Guojun Lu
165
18
0
03 Oct 2021
SpliceOut: A Simple and Efficient Audio Augmentation Method
Arjit Jain
Pranay Reddy Samala
Deepak Mittal
Preethi Jyothi
M. Singh
451
13
0
30 Sep 2021
Challenges and Opportunities of Speech Recognition for Bengali Language
M. F. Mridha
Abu Quwsar Ohi
Md. Abdul Hamid
M. Monowar
109
7
0
27 Sep 2021
DeepStroke: An Efficient Stroke Screening Framework for Emergency Rooms with Multimodal Adversarial Deep Learning
Tongan Cai
Haomiao Ni
Ming-Chieh Yu
Xiaolei Huang
K. Wong
John Volpi
Chao Guo
Stephen T. C. Wong
158
26
0
24 Sep 2021
KOHTD: Kazakh Offline Handwritten Text Dataset
Signal processing. Image communication (SPIC), 2021
N. Toiganbayeva
M. Kasem
Galymzhan Abdimanap
K. Bostanbekov
Abdelrahman Abdallah
Anel N. Alimova
D. Nurseitov
207
29
0
22 Sep 2021
Live Speech Portraits: Real-Time Photorealistic Talking-Head Animation
ACM Transactions on Graphics (TOG), 2021
Yuanxun Lu
Jinxiang Chai
Xun Cao
205
97
0
22 Sep 2021
Reliable Neural Networks for Regression Uncertainty Estimation
Tony Tohme
Kevin Vanslette
K. Youcef-Toumi
UQCV
BDL
224
16
0
16 Sep 2021
Unsupervised Domain Adaptation Schemes for Building ASR in Low-resource Languages
A. C. S.
Prathosh A P
A. G. Ramakrishnan
203
16
0
12 Sep 2021
Learning Visual-Audio Representations for Voice-Controlled Robots
IEEE International Conference on Robotics and Automation (ICRA), 2021
Peixin Chang
Shuijing Liu
D. L. McPherson
Katherine Driggs-Campbell
SSL
237
8
0
07 Sep 2021
SEC4SR: A Security Analysis Platform for Speaker Recognition
Guangke Chen
Zhe Zhao
Fu Song
Sen Chen
Lingling Fan
Yang Liu
AAML
169
13
0
04 Sep 2021
Efficient conformer: Progressive downsampling and grouped attention for automatic speech recognition
Automatic Speech Recognition & Understanding (ASRU), 2021
Maxime Burchi
Valentin Vielzeuf
185
101
0
31 Aug 2021
Adversarial Example Devastation and Detection on Speech Recognition System by Adding Random Noise
Journal of The Audio Engineering Society (JAES), 2021
Mingyu Dong
Diqun Yan
Yongkang Gong
Rangding Wang
AAML
196
2
0
31 Aug 2021
Investigating Vulnerabilities of Deep Neural Policies
Conference on Uncertainty in Artificial Intelligence (UAI), 2021
Ezgi Korkmaz
AAML
138
38
0
30 Aug 2021
Automatic Speech Recognition And Limited Vocabulary: A Survey
J. L. E. K. Fendji
D. Tala
B. Yenke
M. Atemkeng
263
3
0
23 Aug 2021
FACIAL: Synthesizing Dynamic Talking Face with Implicit Attribute Learning
Chenxu Zhang
Yifan Zhao
Yifei Huang
Ming Zeng
Saifeng Ni
M. Budagavi
Xiaohu Guo
CVBM
166
143
0
18 Aug 2021
Detecting OODs as datapoints with High Uncertainty
R. Kaur
Susmit Jha
Anirban Roy
Sangdon Park
O. Sokolsky
Insup Lee
AAML
UQCV
128
15
0
13 Aug 2021
SpecMix: A Mixed Sample Data Augmentation method for Training with Time-Frequency Domain Features
Interspeech (Interspeech), 2021
Gwantae Kim
D. Han
Hanseok Ko
138
59
0
06 Aug 2021
Dyn-ASR: Compact, Multilingual Speech Recognition via Spoken Language and Accent Identification
World Forum on Internet of Things (WF-IoT), 2021
Sangeeta Ghangam
Daniel Whitenack
Joshua Nemecek
94
4
0
04 Aug 2021
A Study of Multilingual End-to-End Speech Recognition for Kazakh, Russian, and English
Saida Mussakhojayeva
Yerbolat Khassanov
H. A. Varol
153
19
0
03 Aug 2021
The History of Speech Recognition to the Year 2030
Awni Y. Hannun
AI4TS
229
24
0
30 Jul 2021
CarneliNet: Neural Mixture Model for Automatic Speech Recognition
A. Kalinov
Somshubra Majumdar
Jagadeesh Balam
Boris Ginsburg
MoE
105
3
0
22 Jul 2021
Trustworthy AI: A Computational Perspective
Haochen Liu
Yiqi Wang
Wenqi Fan
Xiaorui Liu
Yaxin Li
Shaili Jain
Yunhao Liu
Anil K. Jain
Shucheng Zhou
FaML
412
258
0
12 Jul 2021
End-to-End Rich Transcription-Style Automatic Speech Recognition with Semi-Supervised Learning
Tomohiro Tanaka
Ryo Masumura
Mana Ihori
Akihiko Takashima
Shota Orihashi
Naoki Makishima
128
4
0
07 Jul 2021
A Survey on Data Augmentation for Text Classification
Markus Bayer
M. Kaufhold
Christian A. Reuter
456
426
0
07 Jul 2021
Egocentric Videoconferencing
Mohamed A. Elgharib
Mohit Mendiratta
Justus Thies
Matthias Nießner
Hans-Peter Seidel
A. Tewari
Vladislav Golyanik
Christian Theobalt
EgoV
132
17
0
07 Jul 2021