1512.02595
Cited By
Deep Speech 2: End-to-End Speech Recognition in English and Mandarin
8 December 2015
Dario Amodei
Rishita Anubhai
Eric Battenberg
Carl Case
Jared Casper
Bryan Catanzaro
Jingdong Chen
Mike Chrzanowski
Adam Coates
G. Diamos
Erich Elsen
Jesse Engel
Linxi Fan
Christopher Fougner
T. Han
Awni Y. Hannun
Billy Jun
P. LeGresley
Libby Lin
Sharan Narang
A. Ng
Sherjil Ozair
R. Prenger
Jonathan Raiman
S. Satheesh
David Seetapun
Shubho Sengupta
Yi Wang
Zhiqian Wang
Chong-Jun Wang
Bo Xiao
Dani Yogatama
J. Zhan
Zhenyao Zhu
Papers citing
"Deep Speech 2: End-to-End Speech Recognition in English and Mandarin"
50 / 931 papers shown
Title
SURT 2.0: Advances in Transducer-based Multi-talker Speech Recognition
Desh Raj
Daniel Povey
Sanjeev Khudanpur
VLM
34
9
0
18 Jun 2023
MobileASR: A resource-aware on-device learning framework for user voice personalization applications on mobile phones
Zitha Sasindran
Harsha Yelchuri
Pooja S B. Rao
Prabhakar Venkata Tamma
17
1
0
15 Jun 2023
DCTX-Conformer: Dynamic context carry-over for low latency unified streaming and non-streaming Conformer ASR
Goeric Huybrechts
S. Ronanki
Xilai Li
H. Nosrati
S. Bodapati
Katrin Kirchhoff
20
1
0
13 Jun 2023
What Can an Accent Identifier Learn? Probing Phonetic and Prosodic Information in a Wav2vec2-based Accent Identification Model
Mu Yang
R. Shekar
Okim Kang
John H. L. Hansen
23
11
0
10 Jun 2023
Arabic Dysarthric Speech Recognition Using Adversarial and Signal-Based Augmentation
Massa Baali
Ibrahim Almakky
Shady Shehata
Fakhri Karray
37
1
0
07 Jun 2023
End-to-End Learning for Stochastic Optimization: A Bayesian Perspective
Yves Rychener
Daniel Kuhn
Tobias Sutter
OOD
BDL
36
10
0
07 Jun 2023
Looking and Listening: Audio Guided Text Recognition
Wenwen Yu
Mingyu Liu
Biao Yang
Enming Zhang
Deqiang Jiang
Xing Sun
Yuliang Liu
Xiang Bai
DiffM
27
1
0
06 Jun 2023
Efficient Spoken Language Recognition via Multilabel Classification
Oriol Nieto
Zeyu Jin
Franck Dernoncourt
Justin Salamon
18
1
0
02 Jun 2023
Trustworthy Sensor Fusion against Inaudible Command Attacks in Advanced Driver-Assistance System
Jiwei Guan
Lei Pan
Chen Wang
Shui Yu
Longxiang Gao
Xi Zheng
AAML
19
3
0
30 May 2023
Downstream Task Agnostic Speech Enhancement with Self-Supervised Representation Loss
Hiroshi Sato
Ryo Masumura
Tsubasa Ochiai
Marc Delcroix
Takafumi Moriya
...
Kentaro Shinayama
Saki Mizuno
Mana Ihori
Tomohiro Tanaka
Nobukatsu Hojo
37
5
0
24 May 2023
Wav2SQL: Direct Generalizable Speech-To-SQL Parsing
Huadai Liu
Rongjie Huang
Jinzheng He
Gang Sun
Ran Shen
Xize Cheng
Zhou Zhao
31
3
0
21 May 2023
DoReMi: Optimizing Data Mixtures Speeds Up Language Model Pretraining
Sang Michael Xie
Hieu H. Pham
Xuanyi Dong
Nan Du
Hanxiao Liu
Yifeng Lu
Percy Liang
Quoc V. Le
Tengyu Ma
Adams Wei Yu
MoMe
MoE
56
178
0
17 May 2023
Value Iteration Networks with Gated Summarization Module
Jinyu Cai
Jialong Li
Mingyue Zhang
Kenji Tei
25
2
0
11 May 2023
Quran Recitation Recognition using End-to-End Deep Learning
Ahmad Al Harere
Khloud Al Jallad
38
6
0
10 May 2023
SoK: Pragmatic Assessment of Machine Learning for Network Intrusion Detection
Giovanni Apruzzese
Pavel Laskov
J. Schneider
44
25
0
30 Apr 2023
Enhancing multilingual speech recognition in air traffic control by sentence-level language identification
Peng Fan
Dongyue Guo
Jianwei Zhang
Bo Yang
Yi Lin
17
6
0
29 Apr 2023
AVFace: Towards Detailed Audio-Visual 4D Face Reconstruction
Aggelina Chatziagapi
Dimitris Samaras
3DH
CVBM
33
3
0
25 Apr 2023
Dynamic Chunk Convolution for Unified Streaming and Non-Streaming Conformer ASR
Xilai Li
Goeric Huybrechts
S. Ronanki
Jeffrey J. Farris
S. Bodapati
38
6
0
18 Apr 2023
Energy-Efficient GPU Clusters Scheduling for Deep Learning
Diandian Gu
Xintong Xie
Gang Huang
Xin Jin
Xuanzhe Liu
GNN
24
7
0
13 Apr 2023
Wav2code: Restore Clean Speech Representations via Codebook Lookup for Noise-Robust ASR
Yuchen Hu
Cheng Chen
Qiu-shi Zhu
Eng Siong Chng
22
15
0
11 Apr 2023
Pyramid Multi-branch Fusion DCNN with Multi-Head Self-Attention for Mandarin Speech Recognition
Kai Liu
Hailiang Xiong
Gangqiang Yang
Zhengfeng Du
Yewen Cao
D. Shah
18
0
0
23 Mar 2023
Bayesian Pseudo-Coresets via Contrastive Divergence
Piyush Tiwary
Kumar Shubham
V. Kashyap
Prathosh A.P.
21
3
0
20 Mar 2023
A Deep Learning System for Domain-specific Speech Recognition
Yanan Jia
14
2
0
18 Mar 2023
MMFace4D: A Large-Scale Multi-Modal 4D Face Dataset for Audio-Driven 3D Face Animation
Haozhe Wu
Jia Jia
Junliang Xing
Hongwei Xu
Xiangyuan Wang
Jelo Wang
CVBM
32
7
0
17 Mar 2023
Toward a Geometric Theory of Manifold Untangling
Xin Li
Shuo Wang
31
0
0
07 Mar 2023
Speech Modeling with a Hierarchical Transformer Dynamical VAE
Xiaoyu Lin
Xiaoyu Bie
Simon Leglaive
Laurent Girin
Xavier Alameda-Pineda
BDL
39
2
0
07 Mar 2023
End-to-End Speech Recognition: A Survey
Rohit Prabhavalkar
Takaaki Hori
Tara N. Sainath
Ralf Schluter
Shinji Watanabe
VLM
26
150
0
03 Mar 2023
LiteG2P: A fast, light and high accuracy model for grapheme-to-phoneme conversion
Chunfeng Wang
Peisong Huang
Yuxiang Zou
Haoyu Zhang
Shichao Liu
Xiang Yin
Zejun Ma
11
2
0
02 Mar 2023
Defending against Adversarial Audio via Diffusion Model
Shutong Wu
Jiong Wang
Ming-Yu Liu
Weili Nie
Chaowei Xiao
DiffM
40
25
0
02 Mar 2023
Improving Medical Speech-to-Text Accuracy with Vision-Language Pre-training Model
Jaeyoung Huh
Sangjoon Park
Jeonghyeon Lee
Jong Chul Ye
LM&MA
19
9
0
27 Feb 2023
Factual Consistency Oriented Speech Recognition
Naoyuki Kanda
Takuya Yoshioka
Yang Liu
43
0
0
24 Feb 2023
Speech Privacy Leakage from Shared Gradients in Distributed Learning
Zhuohang Li
Jiaxin Zhang
Jian-Dong Liu
FedML
32
1
0
21 Feb 2023
Emphasizing Unseen Words: New Vocabulary Acquisition for End-to-End Speech Recognition
Leyuan Qu
C. Weber
S. Wermter
11
5
0
20 Feb 2023
E2E Spoken Entity Extraction for Virtual Agents
Karan Singla
Yeon-Jun Kim
S. Bangalore
29
1
0
16 Feb 2023
Stabilising and accelerating light gated recurrent units for automatic speech recognition
Adel Moumen
Titouan Parcollet
28
3
0
16 Feb 2023
Policy-Induced Self-Supervision Improves Representation Finetuning in Visual RL
Sébastien M. R. Arnold
Fei Sha
SSL
21
0
0
12 Feb 2023
Improving Rare Words Recognition through Homophone Extension and Unified Writing for Low-resource Cantonese Speech Recognition
Ho-Lam Chung
Junan Li
Pengfei Liu
Wai-Kim Leung
Xixin Wu
Helen Meng
38
3
0
02 Feb 2023
Open Problems in Applied Deep Learning
M. Raissi
AI4CE
42
2
0
26 Jan 2023
MooseNet: A Trainable Metric for Synthesized Speech with a PLDA Module
Ondřej Plátek
Ondrej Dusek
31
2
0
17 Jan 2023
Dataset Distillation: A Comprehensive Review
Ruonan Yu
Songhua Liu
Xinchao Wang
DD
53
121
0
17 Jan 2023
Speech Driven Video Editing via an Audio-Conditioned Diffusion Model
Dan Bigioi
Shubhajit Basak
Michał Stypułkowski
Maciej Ziȩba
H. Jordan
R. Mcdonnell
Peter Corcoran
DiffM
VGen
24
35
0
10 Jan 2023
VSVC: Backdoor attack against Keyword Spotting based on Voiceprint Selection and Voice Conversion
Hanbo Cai
Pengcheng Zhang
Hai Dong
Yan Xiao
Shunhui Ji
18
5
0
20 Dec 2022
AirfRANS: High Fidelity Computational Fluid Dynamics Dataset for Approximating Reynolds-Averaged Navier-Stokes Solutions
F. Bonnet
Jocelyn Ahmed Mazari
Paola Cinnella
Patrick Gallinari
AI4CE
33
54
0
15 Dec 2022
Fully complex-valued deep learning model for visual perception
Aniruddh Sikdar
Sumanth Udupa
Suresh Sundaram
25
2
0
14 Dec 2022
An Exploratory Study of AI System Risk Assessment from the Lens of Data Distribution and Uncertainty
Zhijie Wang
Yuheng Huang
Lei Ma
Haruki Yokoyama
Susumu Tokumoto
Kazuki Munakata
29
4
0
13 Dec 2022
Memories are One-to-Many Mapping Alleviators in Talking Face Generation
Anni Tang
Tianyu He
Xuejiao Tan
Jun Ling
Liang Song
CVBM
26
23
0
09 Dec 2022
Progressive Multi-Scale Self-Supervised Learning for Speech Recognition
Genshun Wan
Tan Liu
Hang Chen
Jia-Yu Pan
Cong Liu
Z. Ye
SSL
18
0
0
07 Dec 2022
Learning the joint distribution of two sequences using little or no paired data
Soroosh Mariooryad
Matt Shannon
Siyuan Ma
Tom Bagby
David Kao
Daisy Stanton
Eric Battenberg
RJ Skerry-Ryan
27
2
0
06 Dec 2022
Robust Speech Recognition via Large-Scale Weak Supervision
Alec Radford
Jong Wook Kim
Tao Xu
Greg Brockman
C. McLeavey
Ilya Sutskever
OffRL
79
3,315
0
06 Dec 2022
Remote estimation of geologic composition using interferometric synthetic-aperture radar in California's Central Valley
Kyongsik Yun
Kyra H Adams
J. Reager
Zhen Liu
Caitlyn Chavez
M. Turmon
Thomas Lu
17
2
0
04 Dec 2022