Deep Speech 2: End-to-End Speech Recognition in English and Mandarin


8 December 2015
Dario Amodei
Rishita Anubhai
Eric Battenberg
Carl Case
Jared Casper
Bryan Catanzaro
Jingdong Chen
Mike Chrzanowski
Adam Coates
G. Diamos
Erich Elsen
Jesse Engel
Linxi Fan
Christopher Fougner
T. Han
Awni Y. Hannun
Billy Jun
P. LeGresley
Libby Lin
Sharan Narang
A. Ng
Sherjil Ozair
R. Prenger
Jonathan Raiman
S. Satheesh
David Seetapun
Shubho Sengupta
Yi Wang
Zhiqian Wang
Chong-Jun Wang
Bo Xiao
Dani Yogatama
J. Zhan
Zhenyao Zhu

Papers citing "Deep Speech 2: End-to-End Speech Recognition in English and Mandarin"

50 / 936 papers shown
Transfer Learning for Robust Low-Resource Children's Speech ASR with Transformers and Source-Filter Warping
Jenthe Thienpondt
Kris Demuynck
17
11
0
19 Jun 2022
Residual Language Model for End-to-end Speech Recognition
E. Tsunoo
Yosuke Kashiwagi
Chaitanya Narisetty
Shinji Watanabe
24
11
0
15 Jun 2022
LADDER: Latent Boundary-guided Adversarial Training
Xiaowei Zhou
Ivor W. Tsang
Jie Yin
AAML
25
6
0
08 Jun 2022
LegoNN: Building Modular Encoder-Decoder Models
Siddharth Dalmia
Dmytro Okhonko
M. Lewis
Sergey Edunov
Shinji Watanabe
Florian Metze
Luke Zettlemoyer
Abdel-rahman Mohamed
AuLLM
MoE
29
14
0
07 Jun 2022
Predicting and Understanding Human Action Decisions during Skillful Joint-Action via Machine Learning and Explainable-AI
Fabrizia Auletta
Rachel W. Kallen
M. D. Bernardo
Michael J. Richardson
6
2
0
06 Jun 2022
Lip-Listening: Mixing Senses to Understand Lips using Cross Modality Knowledge Distillation for Word-Based Models
Hadeel Mabrouk
Omar Abugabal
Nourhan Sakr
Hesham M. Eraqi
VLM
33
2
0
05 Jun 2022
Toward a realistic model of speech processing in the brain with self-supervised learning
Juliette Millet
Charlotte Caucheteux
Pierre Orhan
Yves Boubenec
Alexandre Gramfort
Ewan Dunbar
Christophe Pallier
J. King
33
92
0
03 Jun 2022
Deep neural networks can stably solve high-dimensional, noisy, non-linear inverse problems
Andrés Felipe Lerma Pineda
P. Petersen
31
5
0
02 Jun 2022
FELARE: Fair Scheduling of Machine Learning Tasks on Heterogeneous Edge Systems
Ali Mokhtari
Md. Abir Hossen
Pooyan Jamshidi
M. Salehi
29
9
0
31 May 2022
Do self-supervised speech models develop human-like perception biases?
Juliette Millet
Ewan Dunbar
SSL
24
20
0
31 May 2022
Adaptive Activation Network For Low Resource Multilingual Speech Recognition
Jian Luo
Jianzong Wang
Ning Cheng
Zhenpeng Zheng
Jing Xiao
30
0
0
28 May 2022
Self-supervised models of audio effectively explain human cortical responses to speech
Aditya R. Vaidya
Shailee Jain
Alexander G. Huth
33
42
0
27 May 2022
Transfer and Share: Semi-Supervised Learning from Long-Tailed Data
Tong Wei
Qianqian Liu
Jiang-Xin Shi
Wei-Wei Tu
Lan-Zhe Guo
31
14
0
26 May 2022
Joint Training of Speech Enhancement and Self-supervised Model for Noise-robust ASR
Qiu-shi Zhu
Jie Zhang
Zitian Zhang
Lirong Dai
43
15
0
26 May 2022
PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit
Hui Zhang
Tian Yuan
Junkun Chen
Xintong Li
Renjie Zheng
...
Zeyu Chen
Xiaoguang Hu
Dianhai Yu
Yanjun Ma
Liang Huang
AuLLM
36
24
0
20 May 2022
Insights on Neural Representations for End-to-End Speech Recognition
A. Ollerenshaw
Md. Asif Jalal
Thomas Hain
18
7
0
19 May 2022
Learning Rate Curriculum
Florinel-Alin Croitoru
Nicolae-Cătălin Ristea
Radu Tudor Ionescu
N. Sebe
19
9
0
18 May 2022
Deep Learning Enabled Semantic Communications with Speech Recognition and Synthesis
Zhenzi Weng
Zhijin Qin
Xiaoming Tao
Chengkang Pan
Guangyi Liu
Geoffrey Ye Li
41
132
0
09 May 2022
A Highly Adaptive Acoustic Model for Accurate Multi-Dialect Speech Recognition
Sanghyun Yoo
Inchul Song
Yoshua Bengio
22
28
0
06 May 2022
Non-Autoregressive Machine Translation: It's Not as Fast as it Seems
Jindřich Helcl
Barry Haddow
Alexandra Birch
27
20
0
04 May 2022
SMLT: A Serverless Framework for Scalable and Adaptive Machine Learning Design and Training
Ahsan Ali
Syed Zawad
Paarijaat Aditya
Istemi Ekin Akkus
Ruichuan Chen
Feng Yan
34
9
0
04 May 2022
Wav2Seq: Pre-training Speech-to-Text Encoder-Decoder Models Using Pseudo Languages
Felix Wu
Kwangyoun Kim
Shinji Watanabe
Kyu Jeong Han
Ryan T. McDonald
Kilian Q. Weinberger
Yoav Artzi
SyDa
48
37
0
02 May 2022
DDDM: a Brain-Inspired Framework for Robust Classification
Xiyuan Chen
Xingyu Li
Yi Zhou
Tianming Yang
AAML
DiffM
43
7
0
01 May 2022
Named Entity Recognition for Audio De-Identification
Guillaume Baril
P. Cardinal
Alessandro Lameiras Koerich
24
3
0
26 Apr 2022
CellDefectNet: A Machine-designed Attention Condenser Network for Electroluminescence-based Photovoltaic Cell Defect Inspection
Carol Xu
M. Famouri
Gautam Bathla
Saeejith Nair
M. Shafiee
Alexander Wong
19
4
0
25 Apr 2022
Distributionally Robust Models with Parametric Likelihood Ratios
Paul Michel
Tatsunori Hashimoto
Graham Neubig
OOD
30
15
0
13 Apr 2022
A Wav2vec2-Based Experimental Study on Self-Supervised Learning Methods to Improve Child Speech Recognition
Rishabh Jain
Andrei Barcovschi
Mariam Yiwere
Dan Bigioi
Peter Corcoran
H. Cucu
28
31
0
06 Apr 2022
Successes and critical failures of neural networks in capturing human-like speech recognition
Federico Adolfi
J. Bowers
David Poeppel
UQCV
30
19
0
06 Apr 2022
Towards End-to-end Unsupervised Speech Recognition
Alexander H. Liu
Wei-Ning Hsu
Michael Auli
Alexei Baevski
SSL
31
74
0
05 Apr 2022
Lip to Speech Synthesis with Visual Context Attentional GAN
Minsu Kim
Joanna Hong
Y. Ro
31
51
0
04 Apr 2022
Class-Incremental Learning by Knowledge Distillation with Adaptive Feature Consolidation
Minsoo Kang
Jaeyoo Park
Bohyung Han
CLL
27
179
0
02 Apr 2022
Text-To-Speech Data Augmentation for Low Resource Speech Recognition
Rodolfo Zevallos
21
4
0
01 Apr 2022
Better Intermediates Improve CTC Inference
Tatsuya Komatsu
Yusuke Fujita
Jaesong Lee
Lukas Lee
Shinji Watanabe
Yusuke Kida
17
5
0
01 Apr 2022
PADA: Pruning Assisted Domain Adaptation for Self-Supervised Speech Representations
L. D. Prasad
Sreyan Ghosh
S. Umesh
25
12
0
31 Mar 2022
Memory-Efficient Training of RNN-Transducer with Sampled Softmax
Jaesong Lee
Lukas Lee
Shinji Watanabe
30
8
0
31 Mar 2022
Recent improvements of ASR models in the face of adversarial attacks
R. Olivier
Bhiksha Raj
AAML
24
13
0
29 Mar 2022
Streaming parallel transducer beam search with fast-slow cascaded encoders
Jay Mahadeokar
Yangyang Shi
Ke Li
Duc Le
Jiedan Zhu
Vikas Chandra
Ozlem Kalinli
M. Seltzer
35
15
0
29 Mar 2022
WeNet 2.0: More Productive End-to-End Speech Recognition Toolkit
Binbin Zhang
Di Wu
Zhendong Peng
Xingcheng Song
Zhuoyuan Yao
Hang Lv
Linfu Xie
Chao Yang
Fuping Pan
Jianwei Niu
VLM
29
94
0
29 Mar 2022
Mel Frequency Spectral Domain Defenses against Adversarial Attacks on Speech Recognition Systems
Nicholas Mehlman
Anirudh Sreeram
Raghuveer Peri
Shrikanth Narayanan
AAML
29
4
0
29 Mar 2022
Training speaker recognition systems with limited data
Nik Vaessen
David A. van Leeuwen
19
6
0
28 Mar 2022
Impact of Dataset on Acoustic Models for Automatic Speech Recognition
S. Singh
11
0
0
25 Mar 2022
Neural Predictor for Black-Box Adversarial Attacks on Speech Recognition
Marie Biolková
Bac Nguyen
AAML
33
2
0
18 Mar 2022
Self-Normalized Density Map (SNDM) for Counting Microbiological Objects
K. Graczyk
J. Pawlowski
Sylwia Majchrowska
Tomasz Golan
28
9
0
15 Mar 2022
DialogueNeRF: Towards Realistic Avatar Face-to-Face Conversation Video Generation
Yichao Yan
Zanwei Zhou
Zi Wang
Chen-Ning Yang
Xiaokang Yang
CVBM
21
19
0
15 Mar 2022
Magnetic Field Prediction Using Generative Adversarial Networks
Stefan Pollok
Nataniel Olden-Jorgensen
P. S. Jørgensen
Rasmus Bjørk
GAN
AI4CE
29
15
0
14 Mar 2022
Attacks as Defenses: Designing Robust Audio CAPTCHAs Using Attacks on Automatic Speech Recognition Systems
H. Abdullah
Aditya Karlekar
S. Prasad
Muhammad Sajidur Rahman
Logan Blue
L. A. Bauer
Vincent Bindschaedler
Patrick Traynor
AAML
29
3
0
10 Mar 2022
Shfl-BW: Accelerating Deep Neural Network Inference with Tensor-Core Aware Weight Pruning
Guyue Huang
Haoran Li
Minghai Qin
Fei Sun
Yufei Ding
Yuan Xie
32
18
0
09 Mar 2022
Towards Generalized Models for Task-oriented Dialogue Modeling on Spoken Conversations
Ruijie Yan
Shuang Peng
Haitao Mi
Liang Jiang
Shihui Yang
Yuchi Zhang
Jiajun Li
Liangrui Peng
Yongliang Wang
Zujie Wen
20
4
0
08 Mar 2022
GaitEdge: Beyond Plain End-to-end Gait Recognition for Better Practicality
Junhao Liang
Chao Fan
Saihui Hou
Chuanfu Shen
Yongzhen Huang
Shiqi Yu
CVBM
17
71
0
08 Mar 2022
aaeCAPTCHA: The Design and Implementation of Audio Adversarial CAPTCHA
Md. Imran Hossen
X. Hei
31
4
0
05 Mar 2022