1512.02595
Cited By
Deep Speech 2: End-to-End Speech Recognition in English and Mandarin
8 December 2015
Dario Amodei
Rishita Anubhai
Eric Battenberg
Carl Case
Jared Casper
Bryan Catanzaro
Jingdong Chen
Mike Chrzanowski
Adam Coates
G. Diamos
Erich Elsen
Jesse Engel
Linxi Fan
Christopher Fougner
T. Han
Awni Y. Hannun
Billy Jun
P. LeGresley
Libby Lin
Sharan Narang
A. Ng
Sherjil Ozair
R. Prenger
Jonathan Raiman
S. Satheesh
David Seetapun
Shubho Sengupta
Yi Wang
Zhiqian Wang
Chong-Jun Wang
Bo Xiao
Dani Yogatama
J. Zhan
Zhenyao Zhu
Papers citing
"Deep Speech 2: End-to-End Speech Recognition in English and Mandarin"
50 / 1,096 papers shown
Title
Debiasing, calibrating, and improving Semi-supervised Learning performance via simple Ensemble Projector
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Khanh-Binh Nguyen
128
7
0
24 Oct 2023
Improved Contextual Recognition In Automatic Speech Recognition Systems By Semantic Lattice Rescoring
Ankitha Sudarshan
Vinay Samuel
Parth Patwa
Ibtihel Amara
Vasu Sharma
279
3
0
14 Oct 2023
Whispering LLaMA: A Cross-Modal Generative Error Correction Framework for Speech Recognition
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
S. Radhakrishnan
Chao-Han Huck Yang
S. Khan
Rohit Kumar
N. Kiani
D. Gómez-Cabrero
Jesper N. Tegnér
322
77
0
10 Oct 2023
FedLPA: One-shot Federated Learning with Layer-Wise Posterior Aggregation
Neural Information Processing Systems (NeurIPS), 2023
Xiang Liu
Liangxi Liu
Feiyang Ye
Yunheng Shen
Xia Li
Linshan Jiang
Jialin Li
412
13
0
30 Sep 2023
Developing automatic verbatim transcripts for international multilingual meetings: an end-to-end solution
Machine Translation Summit (MT Summit), 2023
Akshat Dewan
Michal Ziemski
Henri Meylan
Lorenzo Concina
Bruno Pouliquen
113
1
0
27 Sep 2023
Visual Speech Recognition for Languages with Limited Labeled Data using Automatic Labels from Whisper
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Jeong Hun Yeo
Minsu Kim
Shinji Watanabe
Y. Ro
VLM
185
16
0
15 Sep 2023
DiffTalker: Co-driven audio-image diffusion for talking faces via intermediate landmarks
Zipeng Qi
Xulong Zhang
Ning Cheng
Jing Xiao
Jianzong Wang
187
9
0
14 Sep 2023
PhantomSound: Black-Box, Query-Efficient Audio Adversarial Attack via Split-Second Phoneme Injection
International Symposium on Recent Advances in Intrusion Detection (RAID), 2023
Hanqing Guo
Guangjing Wang
Yuanda Wang
Bocheng Chen
Qiben Yan
Li Xiao
AAML
187
12
0
13 Sep 2023
Hybrid ASR for Resource-Constrained Robots: HMM - Deep Learning Fusion
Anshul Ranjan
Kaushik Jegadeesan
53
0
0
11 Sep 2023
Efficient Emotional Adaptation for Audio-Driven Talking-Head Generation
IEEE International Conference on Computer Vision (ICCV), 2023
Yuan Gan
Zongxin Yang
Xihang Yue
Lingyun Sun
Yezhou Yang
199
91
0
10 Sep 2023
ReliTalk: Relightable Talking Portrait Generation from a Single Video
International Journal of Computer Vision (IJCV), 2023
Haonan Qiu
Zhaoxi Chen
Yuming Jiang
Hang Zhou
Xiangyu Fan
Lei Yang
Wayne Wu
Ziwei Liu
DiffM
VGen
202
14
0
05 Sep 2023
Homological Convolutional Neural Networks
Antonio Briola
Yuanrong Wang
Silvia Bartolucci
T. Aste
LMTD
219
7
0
26 Aug 2023
Throughput Maximization of DNN Inference: Batching or Multi-Tenancy?
Seyed Morteza Nabavinejad
M. Ebrahimi
Sherief Reda
154
1
0
26 Aug 2023
Improving Continuous Sign Language Recognition with Cross-Lingual Signs
IEEE International Conference on Computer Vision (ICCV), 2023
Fangyun Wei
Yutong Chen
SLR
172
38
0
21 Aug 2023
Boosting Semi-Supervised Learning by bridging high and low-confidence predictions
Khanh-Binh Nguyen
Joon-Sung Yang
187
19
0
15 Aug 2023
Cross-Attribute Matrix Factorization Model with Shared User Embedding
Wen-Chieh Liang
Zeng Fan
Youzhi Liang
Jianguo Jia
99
3
0
14 Aug 2023
Automated Sizing and Training of Efficient Deep Autoencoders using Second Order Algorithms
Kanishka Tyagi
Chinmay Rane
M. Manry
127
1
0
11 Aug 2023
Speech-Driven 3D Face Animation with Composite and Regional Facial Movements
ACM Multimedia (ACM MM), 2023
Haozhe Wu
Songtao Zhou
Jia Jia
Junliang Xing
Qi Wen
Xiang Wen
CVBM
216
21
0
10 Aug 2023
Personalization of Stress Mobile Sensing using Self-Supervised Learning
Tanvir Islam
Peter Washington
114
7
0
04 Aug 2023
Many-to-Many Spoken Language Translation via Unified Speech and Text Representation Learning with Unit-to-Unit Translation
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Minsu Kim
J. Choi
Dahun Kim
Y. Ro
174
10
0
03 Aug 2023
Mercury: An Automated Remote Side-channel Attack to Nvidia Deep Learning Accelerator
International Conference on Field-Programmable Technology (ICFPT), 2023
Xi-ai Yan
Xiaoxuan Lou
Guowen Xu
Han Qiu
Shangwei Guo
Chip Hong Chang
Tianwei Zhang
AAML
114
9
0
02 Aug 2023
Inaudible Adversarial Perturbation: Manipulating the Recognition of User Speech in Real Time
Network and Distributed System Security Symposium (NDSS), 2023
Xinfeng Li
Chen Yan
Xuancun Lu
Zihan Zeng
Xiaoyu Ji
Wei Dong
AAML
139
15
0
02 Aug 2023
Learning Multi-modal Representations by Watching Hundreds of Surgical Video Lectures
Kun Yuan
V. Srivastav
Tong Yu
Joël L. Lavanchy
J. Marescaux
Pietro Mascagni
Nassir Navab
N. Padoy
615
44
0
27 Jul 2023
Integration of Frame- and Label-synchronous Beam Search for Streaming Encoder-decoder Speech Recognition
Interspeech (Interspeech), 2023
E. Tsunoo
Hayato Futami
Yosuke Kashiwagi
Siddhant Arora
Shinji Watanabe
172
4
0
24 Jul 2023
TST: Time-Sparse Transducer for Automatic Speech Recognition
CAAI International Conference on Artificial Intelligence (ICCAI), 2023
Xiaohui Zhang
Mangui Liang
Zhengkun Tian
Jiangyan Yi
Jianhua Tao
101
0
0
17 Jul 2023
Ed-Fed: A generic federated learning framework with resource-aware client selection for edge devices
IEEE International Joint Conference on Neural Network (IJCNN), 2023
Zitha Sasindran
Harsha Yelchuri
T. V. Prabhakar
FedML
221
5
0
14 Jul 2023
Can Generative Large Language Models Perform ASR Error Correction?
Rao Ma
Mengjie Qian
Potsawee Manakul
Mark Gales
Kate Knill
AuLLM
KELM
246
73
0
09 Jul 2023
Personalized Prediction of Recurrent Stress Events Using Self-Supervised Learning on Multimodal Time-Series Data
Tanvir Islam
Peter Washington
114
12
0
07 Jul 2023
Boosting Norwegian Automatic Speech Recognition
Nordic Conference of Computational Linguistics (NODALIDA), 2023
Javier de la Rosa
Rolv-Arild Braaten
P. Kummervold
Freddy Wetjen
Svein Arne Brygfjeld
187
8
0
04 Jul 2023
Beyond Neural-on-Neural Approaches to Speaker Gender Protection
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
L. V. Bemmel
Zhuoran Liu
Nik Vaessen
Martha Larson
AAML
93
2
0
30 Jun 2023
SURT 2.0: Advances in Transducer-based Multi-talker Speech Recognition
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Desh Raj
Daniel Povey
Sanjeev Khudanpur
VLM
298
16
0
18 Jun 2023
MobileASR: A resource-aware on-device learning framework for user voice personalization applications on mobile phones
International Conference on AI-ML-Systems (ICA), 2023
Zitha Sasindran
Harsha Yelchuri
Pooja S B. Rao
Prabhakar Venkata Tamma
158
1
0
15 Jun 2023
DCTX-Conformer: Dynamic context carry-over for low latency unified streaming and non-streaming Conformer ASR
Interspeech (Interspeech), 2023
Goeric Huybrechts
S. Ronanki
Xilai Li
H. Nosrati
S. Bodapati
Katrin Kirchhoff
156
2
0
13 Jun 2023
What Can an Accent Identifier Learn? Probing Phonetic and Prosodic Information in a Wav2vec2-based Accent Identification Model
Interspeech (Interspeech), 2023
Mu Yang
R. Shekar
Okim Kang
John H. L. Hansen
272
24
0
10 Jun 2023
Arabic Dysarthric Speech Recognition Using Adversarial and Signal-Based Augmentation
Interspeech (Interspeech), 2023
Massa Baali
Ibrahim Almakky
Shady Shehata
Fakhri Karray
167
4
0
07 Jun 2023
End-to-End Learning for Stochastic Optimization: A Bayesian Perspective
International Conference on Machine Learning (ICML), 2023
Yves Rychener
Daniel Kuhn
Tobias Sutter
OOD
BDL
131
12
0
07 Jun 2023
Looking and Listening: Audio Guided Text Recognition
Wenwen Yu
Mingyu Liu
Biao Yang
Enming Zhang
Deqiang Jiang
Xing Sun
Yuliang Liu
Xiang Bai
DiffM
131
1
0
06 Jun 2023
Efficient Spoken Language Recognition via Multilabel Classification
Interspeech (Interspeech), 2023
Oriol Nieto
Zeyu Jin
Franck Dernoncourt
Justin Salamon
93
2
0
02 Jun 2023
Trustworthy Sensor Fusion against Inaudible Command Attacks in Advanced Driver-Assistance System
IEEE Internet of Things Journal (IEEE IoT J.), 2023
Jiwei Guan
Lei Pan
Chen Wang
Shui Yu
Longxiang Gao
Xi Zheng
AAML
168
5
0
30 May 2023
Downstream Task Agnostic Speech Enhancement with Self-Supervised Representation Loss
Interspeech (Interspeech), 2023
Hiroshi Sato
Ryo Masumura
Tsubasa Ochiai
Marc Delcroix
Takafumi Moriya
...
Kentaro Shinayama
Saki Mizuno
Mana Ihori
Tomohiro Tanaka
Nobukatsu Hojo
171
7
0
24 May 2023
Wav2SQL: Direct Generalizable Speech-To-SQL Parsing
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Huadai Liu
Rongjie Huang
Jinzheng He
Gang Sun
Ran Shen
Xize Cheng
Zhou Zhao
198
5
0
21 May 2023
DoReMi: Optimizing Data Mixtures Speeds Up Language Model Pretraining
Neural Information Processing Systems (NeurIPS), 2023
Sang Michael Xie
Hieu H. Pham
Xuanyi Dong
Nan Du
Hanxiao Liu
Yifeng Lu
Abigail Z. Jacobs
Quoc V. Le
Tengyu Ma
Adams Wei Yu
MoMe
MoE
496
274
0
17 May 2023
Value Iteration Networks with Gated Summarization Module
IEEE Access (IEEE Access), 2023
Jinyu Cai
Jialong Li
Mingyue Zhang
Kenji Tei
116
3
0
11 May 2023
Quran Recitation Recognition using End-to-End Deep Learning
Ahmad Al Harere
Khloud Al Jallad
182
13
0
10 May 2023
SoK: Pragmatic Assessment of Machine Learning for Network Intrusion Detection
European Symposium on Security and Privacy (Euro S&P), 2023
Giovanni Apruzzese
Pavel Laskov
J. Schneider
220
40
0
30 Apr 2023
Enhancing multilingual speech recognition in air traffic control by sentence-level language identification
Applied Acoustics (Appl. Acoust.), 2023
Peng Fan
Dongyue Guo
Jianwei Zhang
Bo Yang
Yi Lin
158
9
0
29 Apr 2023
AVFace: Towards Detailed Audio-Visual 4D Face Reconstruction
Computer Vision and Pattern Recognition (CVPR), 2023
Aggelina Chatziagapi
Dimitris Samaras
3DH
CVBM
159
5
0
25 Apr 2023
Dynamic Chunk Convolution for Unified Streaming and Non-Streaming Conformer ASR
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Xilai Li
Goeric Huybrechts
S. Ronanki
Jeffrey J. Farris
S. Bodapati
170
13
0
18 Apr 2023
Energy-Efficient GPU Clusters Scheduling for Deep Learning
Diandian Gu
Xintong Xie
Gang Huang
Xin Jin
Xuanzhe Liu
GNN
188
8
0
13 Apr 2023
Wav2code: Restore Clean Speech Representations via Codebook Lookup for Noise-Robust ASR
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Yuchen Hu
Cheng Chen
Qiu-shi Zhu
Eng Siong Chng
285
17
0
11 Apr 2023