
Deep Speech: Scaling up end-to-end speech recognition

17 December 2014

Papers citing "Deep Speech: Scaling up end-to-end speech recognition"

50 / 768 papers shown

AS2T: Arbitrary Source-To-Target Adversarial Attack on Speaker Recognition Systems

IEEE Transactions on Dependable and Secure Computing (TDSC), 2022

Lingling Fan

178

07 Jun 2022

LegoNN: Building Modular Encoder-Decoder Models

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022

Sergey Edunov

Luke Zettlemoyer

176

07 Jun 2022

Speech Augmentation Based Unsupervised Learning for Keyword Spotting

IEEE International Joint Conference on Neural Network (IJCNN), 2022

174

28 May 2022

Improving CTC-based ASR Models with Gated Interlayer Collaboration

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Yuting Yang

Yuke Li

Binbin Du

318

25 May 2022

Deep Learning for Visual Speech Analysis: A Survey

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022

314

22 May 2022

Cardinality-Minimal Explanations for Monotonic Neural Networks

International Joint Conference on Artificial Intelligence (IJCAI), 2022

279

19 May 2022

Emotion-Controllable Generalized Talking Face Generation

International Joint Conference on Artificial Intelligence (IJCAI), 2022

149

02 May 2022

A Novel Speech-Driven Lip-Sync Model with CNN and LSTM

Xiaohong Li

Xiang Wang

Kai Wang

129

02 May 2022

Extricating IoT Devices from Vendor Infrastructure with Karl

Gina Yuan

David Mazières

Matei A. Zaharia

184

28 Apr 2022

Improving Self-Supervised Learning-based MOS Prediction Networks

Bálint Gyires-Tóth

Csaba Zainkó

SSL

110

23 Apr 2022

Adversarial Scratches: Deployable Attacks to CNN Classifiers

Pattern Recognition (Pattern Recogn.), 2022

Gabriela F. Cretu-Ciocarlie

Briland Hitaj

Giacomo Boracchi

AAML

234

20 Apr 2022

STRATA: Word Boundaries & Phoneme Recognition From Continuous Urdu Speech using Transfer Learning, Attention, & Data Augmentation

Saad Naeem

Omer Beg

16 Apr 2022

A Unified Cascaded Encoder ASR Model for Dynamic Model Sizes

Interspeech (Interspeech), 2022

Ding Zhao

...

118

13 Apr 2022

A Wav2vec2-Based Experimental Study on Self-Supervised Learning Methods to Improve Child Speech Recognition

IEEE Access (IEEE Access), 2022

212

06 Apr 2022

Successes and critical failures of neural networks in capturing human-like speech recognition

Neural Networks (NN), 2022

276

06 Apr 2022

Lip to Speech Synthesis with Visual Context Attentional GAN

Neural Information Processing Systems (NeurIPS), 2022

Minsu Kim

Joanna Hong

Y. Ro

225

04 Apr 2022

Deep Speech Based End-to-End Automated Speech Recognition (ASR) for Indian-English Accents

P. Dubey

B. Shah

03 Apr 2022

Multi-task RNN-T with Semantic Decoder for Streamable Spoken Language Understanding

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Kanthashree Mysore Sathyendra

131

01 Apr 2022

End-to-End Integration of Speech Recognition, Speech Enhancement, and Self-Supervised Learning Representation

Interspeech (Interspeech), 2022

266

01 Apr 2022

Text-To-Speech Data Augmentation for Low Resource Speech Recognition

Rodolfo Zevallos

130

01 Apr 2022

Memory-Efficient Training of RNN-Transducer with Sampled Softmax

Interspeech (Interspeech), 2022

Jaesong Lee

Lukas Lee

Shinji Watanabe

285

31 Mar 2022

An Empirical Study of Language Model Integration for Transducer based Speech Recognition

Interspeech (Interspeech), 2022

Zhijian Ou

194

31 Mar 2022

Is Word Error Rate a good evaluation metric for Speech Recognition in Indic Languages?

170

30 Mar 2022

Improving Speech Recognition for Indic Languages using Language Model

120

30 Mar 2022

4-bit Conformer with Native Quantization Aware Training for Speech Recognition

Interspeech (Interspeech), 2022

375

29 Mar 2022

Noise-robust Speech Recognition with 10 Minutes Unparalleled In-domain Data

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Chen Chen

Yuchen Hu

201

29 Mar 2022

WaveFuzz: A Clean-Label Poisoning Attack to Protect Your Voice

Chao Shen

224

25 Mar 2022

Learning by non-interfering feedback chemical signaling in physical networks

Physical Review Research (Phys. Rev. Res.), 2022

Vidyesh Rao Anisetti

B. Scellier

J. M. Schwarz

127

22 Mar 2022

Neural Predictor for Black-Box Adversarial Attacks on Speech Recognition

Marie Biolková

Bac Nguyen

AAML

157

18 Mar 2022

Generalized but not Robust? Comparing the Effects of Data Modification Methods on Out-of-Domain Generalization and Adversarial Robustness

Findings (Findings), 2022

Tejas Gokhale

Swaroop Mishra

Man Luo

Bhavdeep Singh Sachdeva

Chitta Baral

205

15 Mar 2022

aaeCAPTCHA: The Design and Implementation of Audio Adversarial CAPTCHA

European Symposium on Security and Privacy (Euro S&P), 2022

Md. Imran Hossen

X. Hei

139

05 Mar 2022

A Survey of Multilingual Models for Automatic Speech Recognition

International Conference on Language Resources and Evaluation (LREC), 2022

Hemant Yadav

Sunayana Sitaram

167

25 Feb 2022

Adversarial Attacks on Speech Recognition Systems for Mission-Critical Applications: A Survey

Ngoc Dung Huynh

Mohamed Reda Bouadjenek

Imran Razzak

178

22 Feb 2022

Spanish and English Phoneme Recognition by Training on Simulated Classroom Audio Recordings of Collaborative Learning Environments

Mario Esparza

167

21 Feb 2022

Where Is My Training Bottleneck? Hidden Trade-Offs in Deep Learning Preprocessing Pipelines

325

17 Feb 2022

Mitigating Closed-model Adversarial Examples with Bayesian Neural Modeling for Enhanced End-to-End Speech Recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

191

17 Feb 2022

Vau da muntanialas: Energy-efficient multi-die scalable acceleration of RNN inference

IEEE Transactions on Circuits and Systems Part 1: Regular Papers (TCAS-I), 2021

G. Paulin

Francesco Conti

Lukas Cavigelli

Luca Benini

147

14 Feb 2022

I'm Hearing (Different) Voices: Anonymous Voices to Protect User Privacy

H.C.M. Turner

Giulio Lovisotto

Simon Eberz

Ivan Martinovic

13 Feb 2022

FAAG: Fast Adversarial Audio Generation through Interactive Attack Optimisation

194

11 Feb 2022

Convergence of a New Learning Algorithm

Feng Lin

3DV

104

08 Feb 2022

BEA-Base: A Benchmark for ASR of Spontaneous Hungarian

International Conference on Language Resources and Evaluation (LREC), 2022

153

01 Feb 2022

Visualizing Automatic Speech Recognition -- Means for a Better Understanding?

215

01 Feb 2022

Language Dependencies in Adversarial Attacks on Speech Recognition Systems

196

01 Feb 2022

Unicorn: Reasoning about Configurable System Performance through the lens of Causality

European Conference on Computer Systems (EuroSys), 2022

Md Shahriar Iqbal

R. Krishna

Mohammad Ali Javidian

Baishakhi Ray

Pooyan Jamshidi

LRM

237

20 Jan 2022

iDECODe: In-distribution Equivariance for Conformal Out-of-distribution Detection

AAAI Conference on Artificial Intelligence (AAAI), 2022

206

07 Jan 2022

Discrete and continuous representations and processing in deep learning: Looking forward

AI Open (AO), 2022

300

04 Jan 2022

Multi-Dialect Arabic Speech Recognition

IEEE International Joint Conference on Neural Network (IJCNN), 2020

Abbas Raza Ali

25 Dec 2021

Parameter identifiability of a deep feedforward ReLU neural network

Machine-mediated learning (ML), 2021

Joachim Bona-Pellissier

François Bachoc

François Malgouyres

271

24 Dec 2021

Watch Those Words: Video Falsification Detection Using Word-Conditioned Facial Motion

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2021

Hao Li

205

21 Dec 2021

ImportantAug: a data augmentation agent for speech

V. Trinh

Hassan Salami Kavaki

Michael I. Mandel

211

14 Dec 2021