1512.02595
Deep Speech 2: End-to-End Speech Recognition in English and Mandarin
8 December 2015
Dario Amodei
Rishita Anubhai
Eric Battenberg
Carl Case
Jared Casper
Bryan Catanzaro
Jingdong Chen
Mike Chrzanowski
Adam Coates
G. Diamos
Erich Elsen
Jesse Engel
Linxi Fan
Christopher Fougner
T. Han
Awni Y. Hannun
Billy Jun
P. LeGresley
Libby Lin
Sharan Narang
A. Ng
Sherjil Ozair
R. Prenger
Jonathan Raiman
S. Satheesh
David Seetapun
Shubho Sengupta
Yi Wang
Zhiqian Wang
Chong-Jun Wang
Bo Xiao
Dani Yogatama
J. Zhan
Zhenyao Zhu
Papers citing
"Deep Speech 2: End-to-End Speech Recognition in English and Mandarin"
50 / 931 papers shown
Title
Poem Meter Classification of Recited Arabic Poetry: Integrating High-Resource Systems for a Low-Resource Task
Maged S. Al-Shaibani
Zaid Alyafeai
Irfan Ahmad
46
0
0
16 Apr 2025
SE4Lip: Speech-Lip Encoder for Talking Head Synthesis to Solve Phoneme-Viseme Alignment Ambiguity
Yihuan Huang
Jiajun Liu
Yanzhen Ren
Wuyang Liu
Juhua Tang
24
0
0
08 Apr 2025
Whispering Under the Eaves: Protecting User Privacy Against Commercial and LLM-powered Automatic Speech Recognition Systems
Weifei Jin
Yuxin Cao
Junjie Su
Derui Wang
Yedi Zhang
Minhui Xue
Jie Hao
Jin Song Dong
Yixian Yang
AAML
57
0
0
01 Apr 2025
Whisper-LM: Improving ASR Models with Language Models for Low-Resource Languages
Xabier de Zuazo
Eva Navas
Ibon Saratxaga
Inma Hernáez Rioja
42
0
0
30 Mar 2025
Robust DNN Partitioning and Resource Allocation Under Uncertain Inference Time
Zhaojun Nan
Yunchu Han
Sheng Zhou
Zhisheng Niu
46
0
0
27 Mar 2025
Improving Speech Recognition Accuracy Using Custom Language Models with the Vosk Toolkit
Aniket Abhishek Soni
59
0
0
26 Mar 2025
Deep Learning for Forensic Identification of Source
Cole Patten
Christopher Saunders
Michael Puthawala
42
0
0
26 Mar 2025
Dolphin: A Large-Scale Automatic Speech Recognition Model for Eastern Languages
Yangyang Meng
Jinpeng Li
Guodong Lin
Yu Pu
G. Wang
Hu Du
Zhiming Shao
Yukai Huang
Ke Li
Wei-Qiang Zhang
ObjD
101
0
0
26 Mar 2025
Boosting the Transferability of Audio Adversarial Examples with Acoustic Representation Optimization
Weifei Jin
Junjie Su
Hejia Wang
Yulin Ye
Jie Hao
AAML
45
0
0
25 Mar 2025
RAG-based User Profiling for Precision Planning in Mixed-precision Over-the-Air Federated Learning
Jinsheng Yuan
Yun Tang
Weisi Guo
45
0
0
19 Mar 2025
MMS-LLaMA: Efficient LLM-based Audio-Visual Speech Recognition with Minimal Multimodal Speech Tokens
Jeong Hun Yeo
Hyeongseop Rha
Se Jin Park
Y. Ro
56
0
0
14 Mar 2025
Enhancing Aviation Communication Transcription: Fine-Tuning Distil-Whisper with LoRA
Shokoufeh Mirzaei
Jesse Arzate
Yukti Vijay
32
0
0
13 Mar 2025
ConjointNet: Enhancing Conjoint Analysis for Preference Prediction with Representation Learning
Yanxia Zhang
Francine Chen
Shabnam Hakimi
Totte Harinen
Alex Filipowicz
...
Nikos Aréchiga
Kalani Murakami
Kent Lyons
Charlene C. Wu
Matt Klenk
40
1
0
12 Mar 2025
Make Haste Slowly: A Theory of Emergent Structured Mixed Selectivity in Feature Learning ReLU Networks
Devon Jarvis
Richard Klein
Benjamin Rosman
Andrew M. Saxe
MLT
66
1
0
08 Mar 2025
RouterEval: A Comprehensive Benchmark for Routing LLMs to Explore Model-level Scaling Up in LLMs
Zhongzhan Huang
Guoming Ling
Vincent S. Liang
Yupei Lin
Yandong Chen
Shanshan Zhong
Hefeng Wu
LRM
54
2
0
08 Mar 2025
(Mis)Fitting: A Survey of Scaling Laws
Margaret Li
Sneha Kudugunta
Luke Zettlemoyer
69
2
0
26 Feb 2025
HadamRNN: Binary and Sparse Ternary Orthogonal RNNs
Armand Foucault
Franck Mamalet
François Malgouyres
MQ
85
0
0
28 Jan 2025
Dimensions underlying the representational alignment of deep neural networks with humans
F. Mahner
Lukas Muttenthaler
Umut Güçlü
M. Hebart
48
4
0
28 Jan 2025
Adapting Whisper for Regional Dialects: Enhancing Public Services for Vulnerable Populations in the United Kingdom
Melissa Torgbi
Andrew Clayman
Jordan J. Speight
Harish Tayyar Madabushi
31
0
0
15 Jan 2025
From Audio Deepfake Detection to AI-Generated Music Detection -- A Pathway and Overview
Yupei Li
M. Milling
Lucia Specia
Björn Schuller
89
6
0
30 Nov 2024
RELATE: A Modern Processing Platform for Romanian Language
V. Pais
Radu Ion
Andrei-Marius Avram
Maria Mitrofan
D. Tufis
VLM
24
0
0
29 Oct 2024
TabSeq: A Framework for Deep Learning on Tabular Data via Sequential Ordering
A. Habib
Kesheng Wang
Mary-Anne Hartley
Gianfranco Doretto
Donald Adjeroh
LMTD
37
1
0
17 Oct 2024
A two-stage transliteration approach to improve performance of a multilingual ASR
Rohit Kumar
18
0
0
09 Oct 2024
Convolutional Variational Autoencoders for Spectrogram Compression in Automatic Speech Recognition
Olga Iakovenko
Ivan Bondarenko
27
0
0
03 Oct 2024
Data Extrapolation for Text-to-image Generation on Small Datasets
Senmao Ye
Fei Liu
33
0
0
02 Oct 2024
WeHelp: A Shared Autonomy System for Wheelchair Users
Abulikemu Abuduweili
Alice Wu
Tianhao Wei
Weiye Zhao
43
0
0
18 Sep 2024
Open-World Test-Time Training: Self-Training with Contrast Learning
Houcheng Su
Mengzhu Wang
Jiao Li
Bingli Wang
Daixian Liu
Zeheng Wang
VLM
26
0
0
15 Sep 2024
ASR Error Correction using Large Language Models
Rao Ma
Mengjie Qian
Mark Gales
Kate Knill
KELM
46
1
0
14 Sep 2024
The State of Commercial Automatic French Legal Speech Recognition Systems and their Impact on Court Reporters et al
Nicolas Garneau
Olivier Bolduc
ELM
AILaw
55
0
0
21 Aug 2024
Digital Avatars: Framework Development and Their Evaluation
Timothy Rupprecht
Sung-En Chang
Yushu Wu
Lei Lu
Enfu Nan
...
Zhimin Li
Zhijun Hu
Yumei He
David Kaeli
Yanzhi Wang
31
0
0
07 Aug 2024
DeepSpeech models show Human-like Performance and Processing of Cochlear Implant Inputs
Cynthia R. Steinhardt
Menoua Keshishian
N. Mesgarani
Kim Stachenfeld
26
0
0
30 Jul 2024
Text-based Talking Video Editing with Cascaded Conditional Diffusion
Bo Han
Heqing Zou
Haoyang Li
Guangcong Wang
Chng Eng Siong
VGen
DiffM
37
2
0
20 Jul 2024
CBM: Curriculum by Masking
Andrei Jarca
Florinel-Alin Croitoru
Radu Tudor Ionescu
40
0
0
06 Jul 2024
Evaluating Model Performance Under Worst-case Subpopulations
Mike Li
Hongseok Namkoong
Shangzhou Xia
48
17
0
01 Jul 2024
Zero-Query Adversarial Attack on Black-box Automatic Speech Recognition Systems
Zheng Fang
Tao Wang
Lingchen Zhao
Shenyi Zhang
Bowen Li
Yunjie Ge
Q. Li
Chao Shen
Qian Wang
18
4
0
27 Jun 2024
Token-Weighted RNN-T for Learning from Flawed Data
Gil Keren
Wei Zhou
Ozlem Kalinli
43
0
0
26 Jun 2024
Decoder-only Architecture for Streaming End-to-end Speech Recognition
E. Tsunoo
Hayato Futami
Yosuke Kashiwagi
Siddhant Arora
Shinji Watanabe
RALM
AuLLM
36
6
0
23 Jun 2024
Boosting Consistency in Dual Training for Long-Tailed Semi-Supervised Learning
Kai Gan
Tong Wei
Min-Ling Zhang
40
1
0
19 Jun 2024
Communication-Efficient Distributed Deep Learning via Federated Dynamic Averaging
Michail Theologitis
Georgios Frangias
Georgios Anestis
V. Samoladas
Antonios Deligiannakis
FedML
40
0
0
31 May 2024
OpFlowTalker: Realistic and Natural Talking Face Generation via Optical Flow Guidance
Shuheng Ge
Haoyu Xing
Li Zhang
Xiangqian Wu
39
0
0
23 May 2024
Contribute to balance, wire in accordance: Emergence of backpropagation from a simple, bio-plausible neuroplasticity rule
Xinhao Fan
S. P. Mysore
37
0
0
23 May 2024
Towards Evaluating the Robustness of Automatic Speech Recognition Systems via Audio Style Transfer
Weifei Jin
Yuxin Cao
Junjie Su
Qi Shen
Kai Ye
Derui Wang
Jie Hao
Ziyao Liu
AAML
46
2
0
15 May 2024
Chaos-based reinforcement learning with TD3
Toshitaka Matsuki
Yusuke Sakemi
Kazuyuki Aihara
30
0
0
15 May 2024
Architecture of a Cortex Inspired Hierarchical Event Recaller
Valentín Puente Varona
14
1
0
03 May 2024
Sequence-to-sequence models in peer-to-peer learning: A practical application
Robert Šajina
Ivo Ipšić
46
0
0
02 May 2024
CM-TTS: Enhancing Real Time Text-to-Speech Synthesis Efficiency through Weighted Samplers and Consistency Models
Xiang Li
Fan Bu
Ambuj Mehrish
Yingting Li
Jiale Han
Bo Cheng
Soujanya Poria
DiffM
40
6
0
31 Mar 2024
VidLA: Video-Language Alignment at Scale
Mamshad Nayeem Rizve
Fan Fei
Jayakrishnan Unnikrishnan
Son Tran
Benjamin Z. Yao
Belinda Zeng
Mubarak Shah
Trishul Chilimbi
VLM
AI4TS
58
4
0
21 Mar 2024
Abstracting Sparse DNN Acceleration via Structured Sparse Tensor Decomposition
Geonhwa Jeong
Po-An Tsai
Abhimanyu Bambhaniya
S. Keckler
Tushar Krishna
33
7
0
12 Mar 2024
Speech Robust Bench: A Robustness Benchmark For Speech Recognition
Muhammad A. Shah
David Solans Noguero
Mikko A. Heikkilä
Nicolas Kourtellis
32
5
0
08 Mar 2024
TMT: Tri-Modal Translation between Speech, Image, and Text by Processing Different Modalities as Different Languages
Minsu Kim
Jee-weon Jung
Hyeongseop Rha
Soumi Maiti
Siddhant Arora
Xuankai Chang
Shinji Watanabe
Y. Ro
28
7
0
25 Feb 2024