Deep Speech: Scaling up end-to-end speech recognition

17 December 2014
Awni Y. Hannun
Carl Case
Jared Casper
Bryan Catanzaro
G. Diamos
Erich Elsen
R. Prenger
S. Satheesh
Shubho Sengupta
Adam Coates
A. Ng

Papers citing "Deep Speech: Scaling up end-to-end speech recognition"

50 / 768 papers shown
Imitator: Personalized Speech-driven 3D Facial Animation
IEEE International Conference on Computer Vision (ICCV), 2022
Balamurugan Thambiraja
I. Habibie
S. Aliakbarian
Darren Cosker
Christian Theobalt
Justus Thies
CVBM
240
88
0
30 Dec 2022
End-to-End Automatic Speech Recognition model for the Sudanese Dialect
Ayman Mansour
Wafaa F. Mukhtar
72
1
0
21 Dec 2022
KL Regularized Normalization Framework for Low Resource Tasks
Neeraj Kumar
Ankur Narang
Brejesh Lall
127
1
0
21 Dec 2022
VSVC: Backdoor attack against Keyword Spotting based on Voiceprint Selection and Voice Conversion
Hanbo Cai
Pengcheng Zhang
Hai Dong
Yan Xiao
Shunhui Ji
141
7
0
20 Dec 2022
A Review of Speech-centric Trustworthy Machine Learning: Privacy, Safety, and Fairness
APSIPA Transactions on Signal and Information Processing (TASIP), 2022
Tiantian Feng
Rajat Hebbar
Nicholas Mehlman
Xuan Shi
Aditya Kommineni
Shrikanth Narayanan
228
37
0
18 Dec 2022
An Exploratory Study of AI System Risk Assessment from the Lens of Data Distribution and Uncertainty
Zhijie Wang
Yuheng Huang
Lei Ma
Haruki Yokoyama
Susumu Tokumoto
Kazuki Munakata
198
6
0
13 Dec 2022
Estimator: An Effective and Scalable Framework for Transportation Mode Classification over Trajectories
Danlei Hu
Ziquan Fang
Hanxi Fang
Tianyi Li
Chun-ru Shen
Lu Chen
Yunjun Gao
152
9
0
11 Dec 2022
Memories are One-to-Many Mapping Alleviators in Talking Face Generation
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Anni Tang
Tianyu He
Xuejiao Tan
Jun Ling
Liang Song
CVBM
305
27
0
09 Dec 2022
Thales: Formulating and Estimating Architectural Vulnerability Factors for DNN Accelerators
Abhishek Tyagi
Yiming Gan
Shaoshan Liu
Bo Yu
P. Whatmough
Yuhao Zhu
AAML
241
12
0
05 Dec 2022
PiPar: Pipeline Parallelism for Collaborative Machine Learning
Zihan Zhang
Philip Rodgers
Peter Kilpatrick
I. Spence
Blesson Varghese
FedML
259
6
0
01 Dec 2022
Evaluating and reducing the distance between synthetic and real speech distributions
Interspeech, 2022
Christoph Minixhofer
Ondřej Klejch
P. Bell
213
9
0
29 Nov 2022
Deep representation learning: Fundamentals, Perspectives, Applications, and Open Challenges
K. T. Baghaei
Amirreza Payandeh
Pooya Fayyazsanavi
Shahram Rahimi
Zhiqian Chen
Somayeh Bakhtiari Ramezani
FaML
AI4TS
211
10
0
27 Nov 2022
Dynamic Neural Portraits
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022
M. Doukas
Stylianos Ploumpis
Stefanos Zafeiriou
3DH
126
1
0
25 Nov 2022
HARL: Hierarchical Adaptive Reinforcement Learning Based Auto Scheduler for Neural Networks
International Conference on Parallel Processing (ICPP), 2022
Zining Zhang
Bingsheng He
Zhenjie Zhang
106
6
0
21 Nov 2022
Phonemic Adversarial Attack against Audio Recognition in Real World
Jinyang Guo
Zhendong Chen
Zixin Yin
Qinghong Yang
Xianglong Liu
AAML
124
5
0
19 Nov 2022
VeLO: Training Versatile Learned Optimizers by Scaling Up
Luke Metz
James Harrison
C. Freeman
Amil Merchant
Lucas Beyer
...
Naman Agrawal
Ben Poole
Igor Mordatch
Adam Roberts
Jascha Narain Sohl-Dickstein
282
75
0
17 Nov 2022
Hey ASR System! Why Aren't You More Inclusive? Automatic Speech Recognition Systems' Bias and Proposed Bias Mitigation Techniques. A Literature Review
Interacción (IN), 2022
Mikel K. Ngueajio
Gloria J. Washington
195
42
0
17 Nov 2022
Improving Children's Speech Recognition by Fine-tuning Self-supervised Adult Speech Representations
Renée Lu
M. Shahin
Beena Ahmed
177
7
0
14 Nov 2022
FullPack: Full Vector Utilization for Sub-Byte Quantized Inference on General Purpose CPUs
Hossein Katebi
Navidreza Asadi
M. Goudarzi
MQ
130
1
0
13 Nov 2022
MSDT: Masked Language Model Scoring Defense in Text Domain
International Conference on Universal Village (ICUV), 2022
Jaechul Roh
Minhao Cheng
Yajun Fang
AAML
86
1
0
10 Nov 2022
Peak-First CTC: Reducing the Peak Latency of CTC Models by Applying Peak-First Regularization
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Zhengkun Tian
Hongyu Xiang
Min Li
Fei Lin
Ke Ding
Guanglu Wan
116
7
0
07 Nov 2022
Data-free Defense of Black Box Models Against Adversarial Attacks
Gaurav Kumar Nayak
Inder Khatri
Ruchit Rawal
Anirban Chakraborty
AAML
161
2
0
03 Nov 2022
Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Speech Processing
Neural Information Processing Systems (NeurIPS), 2022
Yonggan Fu
Yang Zhang
Kaizhi Qian
Zhifan Ye
Zhongzhi Yu
Cheng-I Jeff Lai
Yingyan Lin
316
10
0
02 Nov 2022
Modular Hybrid Autoregressive Transducer
Spoken Language Technology Workshop (SLT), 2022
Zhong Meng
Tongzhou Chen
Rohit Prabhavalkar
Yu Zhang
Gary Wang
...
Bhuvana Ramabhadran
Wenjie Huang
Ehsan Variani
Yinghui Huang
Pedro J. Moreno
173
27
0
31 Oct 2022
BERT Meets CTC: New Formulation of End-to-End Speech Recognition with Pre-trained Masked Language Model
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Yosuke Higuchi
Brian Yan
Siddhant Arora
Tetsuji Ogawa
Tetsunori Kobayashi
Shinji Watanabe
238
31
0
29 Oct 2022
ESB: A Benchmark For Multi-Domain End-to-End Speech Recognition
Sanchit Gandhi
Patrick von Platen
Alexander M. Rush
133
27
0
24 Oct 2022
10 hours data is all you need
Zeping Min
Qian Ge
Zhong Li
165
3
0
24 Oct 2022
Improving Semi-supervised End-to-end Automatic Speech Recognition using CycleGAN and Inter-domain Losses
Spoken Language Technology Workshop (SLT), 2022
C. Li
Ngoc Thang Vu
125
3
0
20 Oct 2022
Accelerating Transfer Learning with Near-Data Computation on Cloud Object Stores
ACM Symposium on Cloud Computing (SoCC), 2022
Arsany Guirguis
Diana Petrescu
Florin Dinu
D. Quoc
Javier Picorel
R. Guerraoui
206
0
0
16 Oct 2022
Deep learning model compression using network sensitivity and gradients
M. Sakthi
N. Yadla
Raj Pawate
150
2
0
11 Oct 2022
JoeyS2T: Minimalistic Speech-to-Text Modeling with JoeyNMT
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Mayumi Ohta
Julia Kreutzer
Stefan Riezler
151
0
0
05 Oct 2022
E-Branchformer: Branchformer with Enhanced merging for speech recognition
Spoken Language Technology Workshop (SLT), 2022
Kwangyoun Kim
Felix Wu
Yifan Peng
Jing Pan
Prashant Sridhar
Kyu Jeong Han
Shinji Watanabe
367
156
0
30 Sep 2022
ZeroEGGS: Zero-shot Example-based Gesture Generation from Speech
Saeed Ghorbani
Ylva Ferstl
Daniel Holden
N. Troje
M. Carbonneau
264
117
0
15 Sep 2022
Deep Speech Synthesis from Articulatory Representations
Interspeech, 2022
Peter Wu
Shinji Watanabe
Louis Goldstein
A. Black
Gopala K. Anumanchipalli
174
34
0
13 Sep 2022
Synthesizing Photorealistic Virtual Humans Through Cross-modal Disentanglement
Computer Vision and Pattern Recognition (CVPR), 2022
S. Ravichandran
Ondrej Texler
Dimitar Dinev
Hyun Jae Kang
137
4
0
03 Sep 2022
Universal Fourier Attack for Time Series
IEEE Open Journal of Signal Processing (JOSP), 2022
Elizabeth Coda
B. Clymer
Chance N. DeSmet
Y. Watkins
Michael Girard
165
1
0
02 Sep 2022
RL-DistPrivacy: Privacy-Aware Distributed Deep Inference for low latency IoT systems
IEEE Transactions on Network Science and Engineering (IEEE T-NSE), 2022
Emna Baccour
A. Erbad
Amr M. Mohamed
Mounir Hamdi
Mohsen Guizani
111
16
0
27 Aug 2022
Not All GPUs Are Created Equal: Characterizing Variability in Large-Scale, Accelerator-Rich Systems
International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2022
Prasoon Sinha
Akhil Guliani
Rutwik Jain
Brandon Tran
Matthew D. Sinclair
Shivaram Venkataraman
181
29
0
23 Aug 2022
How does the degree of novelty impacts semi-supervised representation learning for novel class retrieval?
Q. Leroy
Olivier Buisson
Alexis Joly
SSL
75
0
0
17 Aug 2022
Unifying Gradients to Improve Real-world Robustness for Deep Networks
ACM Transactions on Intelligent Systems and Technology (ACM TIST), 2022
Yingwen Wu
Sizhe Chen
Kun Fang
Xiaolin Huang
AAML
191
4
0
12 Aug 2022
Zeus: Understanding and Optimizing GPU Energy Consumption of DNN Training
Symposium on Networked Systems Design and Implementation (NSDI), 2022
Jie You
Jaehoon Chung
Mosharaf Chowdhury
280
119
0
12 Aug 2022
Pronunciation-aware unique character encoding for RNN Transducer-based Mandarin speech recognition
Spoken Language Technology Workshop (SLT), 2022
Peng Shen
Xugang Lu
Hisashi Kawai
105
2
0
29 Jul 2022
Learning Dynamic Facial Radiance Fields for Few-Shot Talking Head Synthesis
European Conference on Computer Vision (ECCV), 2022
Shuai Shen
Wanhua Li
Zhengbiao Zhu
Yueqi Duan
Jie Zhou
Jiwen Lu
CVBM
161
130
0
24 Jul 2022
Improving spatial cues for hearables using a parameterized binaural CDR estimator
Reza Ghanavi
C. Jin
60
1
0
17 Jul 2022
End-to-End Spoken Language Understanding: Performance analyses of a voice command task in a low resource setting
Computer Speech and Language (CSL), 2022
Thierry Desot
François Portet
Michel Vacher
103
15
0
17 Jul 2022
pMCT: Patched Multi-Condition Training for Robust Speech Recognition
Interspeech, 2022
Pablo Peso Parada
A. Dobrowolska
Karthikeyan P. Saravanan
Mete Ozay
224
11
0
11 Jul 2022
Adversarial Ensemble Training by Jointly Learning Label Dependencies and Member Models
International Conference on Intelligent Computing (ICIC), 2022
Lele Wang
B. Liu
UQCV
315
7
0
29 Jun 2022
The Makerere Radio Speech Corpus: A Luganda Radio Corpus for Automatic Speech Recognition
International Conference on Language Resources and Evaluation (LREC), 2022
Jonathan Mukiibi
Andrew Katumba
J. Nakatumba‐Nabende
Ali Hussein
Josh Meyer
201
7
0
20 Jun 2022
Residual Language Model for End-to-end Speech Recognition
Interspeech, 2022
E. Tsunoo
Yosuke Kashiwagi
Chaitanya Narisetty
Shinji Watanabe
131
11
0
15 Jun 2022
Local Identifiability of Deep ReLU Neural Networks: the Theory
Neural Information Processing Systems (NeurIPS), 2022
Joachim Bona-Pellissier
François Malgouyres
François Bachoc
FAtt
292
11
0
15 Jun 2022