Listen, Attend and Spell

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2015

5 August 2015

Papers citing "Listen, Attend and Spell"

50 / 1,064 papers shown

CTC Alignments Improve Autoregressive Translation

Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2022

Graham Neubig

182

11 Oct 2022

DeepPerform: An Efficient Approach for Performance Testing of Resource-Constrained Neural Networks

International Conference on Automated Software Engineering (ASE), 2022

209

10 Oct 2022

JoeyS2T: Minimalistic Speech-to-Text Modeling with JoeyNMT

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

Mayumi Ohta

Julia Kreutzer

Stefan Riezler

166

05 Oct 2022

Relaxed Attention for Transformer Models

IEEE International Joint Conference on Neural Network (IJCNN), 2022

173

20 Sep 2022

Watch What You Pretrain For: Targeted, Transferable Adversarial Examples on Self-Supervised Speech Recognition models

R. Olivier

H. Abdullah

Bhiksha Raj

AAML

267

17 Sep 2022

Parameter-Efficient Conformers via Sharing Sparsely-Gated Experts for End-to-End Speech Recognition

Interspeech (Interspeech), 2022

140

17 Sep 2022

Analysis of Self-Attention Head Diversity for Conformer-based Automatic Speech Recognition

Interspeech (Interspeech), 2022

Kartik Audhkhasi

Yinghui Huang

Bhuvana Ramabhadran

Pedro J. Moreno

128

13 Sep 2022

Non-autoregressive Error Correction for CTC-based ASR with Phone-conditioned Masked LM

Interspeech (Interspeech), 2022

248

08 Sep 2022

Distilling the Knowledge of BERT for CTC-based ASR

189

05 Sep 2022

Vision-Language Adaptive Mutual Decoder for OOV-STR

273

02 Sep 2022

Bayesian Neural Network Language Modeling for Speech Recognition

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022

264

28 Aug 2022

Interpretable Multimodal Emotion Recognition using Hybrid Fusion of Speech and Image Data

Puneet Kumar

Sarthak Malik

Balasubramanian Raman

CVBM

200

25 Aug 2022

Comparison and Analysis of New Curriculum Criteria for End-to-End ASR

Interspeech (Interspeech), 2022

Georgios Karakasidis

Tamás Grósz

M. Kurimo

133

10 Aug 2022

ASR Error Correction with Constrained Decoding on Operation Prediction

Interspeech (Interspeech), 2022

J. Yang

Rong-Zhi Li

Wei Peng

192

09 Aug 2022

Adversarial Attacks on ASR Systems: An Overview

International Conference on Data Science in Cyberspace (ICDSC), 2022

132

03 Aug 2022

VQ-T: RNN Transducers using Vector-Quantized Prediction Network States

Interspeech (Interspeech), 2022

Jiatong Shi

153

03 Aug 2022

Pronunciation-aware unique character encoding for RNN Transducer-based Mandarin speech recognition

Spoken Language Technology Workshop (SLT), 2022

Peng Shen

Xugang Lu

Hisashi Kawai

106

29 Jul 2022

Improving Mandarin Speech Recogntion with Block-augmented Transformer

230

24 Jul 2022

Reducing Geographic Disparities in Automatic Speech Recognition via Elastic Weight Consolidation

Interspeech (Interspeech), 2022

107

16 Jul 2022

PoLyScriber: Integrated Fine-tuning of Extractor and Lyrics Transcriber for Polyphonic Music

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022

Xiaoxue Gao

Chitralekha Gupta

Haizhou Li

273

15 Jul 2022

Branchformer: Parallel MLP-Attention Architectures to Capture Local and Global Context for Speech Recognition and Understanding

International Conference on Machine Learning (ICML), 2022

271

193

06 Jul 2022

DEFORMER: Coupling Deformed Localized Patterns with Global Context for Robust End-to-end Speech Recognition

Interspeech (Interspeech), 2022

Jiamin Xie

John H. L. Hansen

172

04 Jul 2022

Tree-constrained Pointer Generator with Graph Neural Network Encodings for Contextual Speech Recognition

Interspeech (Interspeech), 2022

Guangzhi Sun

Chuxu Zhang

P. Woodland

154

02 Jul 2022

FitHuBERT: Going Thinner and Deeper for Knowledge Distillation of Speech Self-Supervised Learning

248

01 Jul 2022

Language-specific Characteristic Assistance for Code-switching Speech Recognition

Interspeech (Interspeech), 2022

Tongtong Song

Qiang Xu

Meng Ge

Longbiao Wang

Hao Shi

Yongjie Lv

Yuqin Lin

Jianwu Dang

195

29 Jun 2022

Contextual Density Ratio for Language Model Biasing of Sequence to Sequence ASR Systems

Interspeech (Interspeech), 2021

106

29 Jun 2022

On Comparison of Encoders for Attention based End to End Speech Recognition in Standalone and Rescoring Mode

International Conference on Signal Processing and Communications (ICSPC), 2022

Raviraj Joshi

Subodh Kumar

111

26 Jun 2022

Towards Green ASR: Lossless 4-bit Quantization of a Hybrid TDNN System on the 300-hr Switchboard Corpus

Interspeech (Interspeech), 2022

214

23 Jun 2022

Two-pass Decoding and Cross-adaptation Based System Combination of End-to-end Conformer and Hybrid TDNN ASR Systems

Interspeech (Interspeech), 2022

Mingyu Cui

Jiajun Deng

Shoukang Hu

Xurong Xie

Tianzi Wang

Shujie Hu

145

23 Jun 2022

Boosting Cross-Domain Speech Recognition with Self-Supervision

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022

Pengyuan Zhang

344

20 Jun 2022

Avoid Overfitting User Specific Information in Federated Keyword Spotting

Interspeech (Interspeech), 2022

Xin-Chun Li

143

17 Jun 2022

Paraformer: Fast and Accurate Parallel Transformer for Non-autoregressive End-to-End Speech Recognition

Interspeech (Interspeech), 2022

237

181

16 Jun 2022

Residual Language Model for End-to-end Speech Recognition

Interspeech (Interspeech), 2022

148

15 Jun 2022

LegoNN: Building Modular Encoder-Decoder Models

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022

Sergey Edunov

Luke Zettlemoyer

176

07 Jun 2022

Contextual Adapters for Personalized Speech Recognition in Neural Transducers

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Kanthashree Mysore Sathyendra

Athanasios Mouchtaris

Siegfried Kunzmann

188

26 May 2022

Transcormer: Transformer for Sentence Scoring with Sliding Language Modeling

Neural Information Processing Systems (NeurIPS), 2022

Kaitao Song

Yichong Leng

Xu Tan

Yicheng Zou

Tao Qin

Dongsheng Li

235

25 May 2022

Adaptive multilingual speech recognition with pretrained models

Interspeech (Interspeech), 2022

210

24 May 2022

Multi-Level Modeling Units for End-to-End Mandarin Speech Recognition

International Symposium on Chinese Spoken Language Processing (ISCSLP), 2022

Yuting Yang

Binbin Du

Yuke Li

329

24 May 2022

Deep Learning for Visual Speech Analysis: A Survey

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022

314

22 May 2022

Minimising Biasing Word Errors for Contextual ASR with the Tree-Constrained Pointer Generator

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022

Guangzhi Sun

Chuxu Zhang

P. Woodland

202

18 May 2022

Evaluating Membership Inference Through Adversarial Robustness

Computer/law journal (JITPL), 2022

202

14 May 2022

Improved Consistency Training for Semi-Supervised Sequence-to-Sequence ASR via Speech Chain Reconstruction and Self-Transcribing

Interspeech (Interspeech), 2022

252

14 May 2022

Personalized Adversarial Data Augmentation for Dysarthric and Elderly Speech Recognition

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022

Zengrui Jin

Mengzhe Geng

Jiajun Deng

Tianzi Wang

Shujie Hu

Guinan Li

Xunying Liu

226

13 May 2022

Wav2Seq: Pre-training Speech-to-Text Encoder-Decoder Models Using Pseudo Languages

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Kwangyoun Kim

216

02 May 2022

How does a spontaneously speaking conversational agent affect user behavior?

IEEE Access (IEEE Access), 2022

Takahisa Iizuka

H. Mori

02 May 2022

Bilingual End-to-End ASR with Byte-Level Subwords

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Liuhui Deng

Roger Hsiao

Arnab Ghoshal

149

01 May 2022

Attention Mechanism in Neural Networks: Where it Comes and Where it Goes

Derya Soydaner

3DV

279

290

27 Apr 2022

Supervised Attention in Sequence-to-Sequence Models for Speech Recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Gene-Ping Yang

Hao Tang

121

25 Apr 2022

Efficient Training of Neural Transducer for Speech Recognition

Interspeech (Interspeech), 2022

188

22 Apr 2022

Cross-stitched Multi-modal Encoders

Karan Singla

Daniel Pressel

Ryan Price

Bhargav Srinivas Chinnari

Yeon-Jun Kim

S. Bangalore

161

20 Apr 2022