arXiv:1508.01211
Listen, Attend and Spell
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2015
5 August 2015
William Chan
Navdeep Jaitly
Quoc V. Le
Oriol Vinyals
RALM
Papers citing "Listen, Attend and Spell"
50 / 1,064 papers shown
CTC Alignments Improve Autoregressive Translation
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2022
Brian Yan
Siddharth Dalmia
Yosuke Higuchi
Graham Neubig
Florian Metze
A. Black
Shinji Watanabe
182
36
0
11 Oct 2022
DeepPerform: An Efficient Approach for Performance Testing of Resource-Constrained Neural Networks
International Conference on Automated Software Engineering (ASE), 2022
Simin Chen
Mirazul Haque
Cong Liu
Wei Yang
209
24
0
10 Oct 2022
JoeyS2T: Minimalistic Speech-to-Text Modeling with JoeyNMT
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Mayumi Ohta
Julia Kreutzer
Stefan Riezler
166
0
0
05 Oct 2022
Relaxed Attention for Transformer Models
IEEE International Joint Conference on Neural Network (IJCNN), 2022
Timo Lohrenz
Björn Möller
Zhengyang Li
Tim Fingscheidt
KELM
173
13
0
20 Sep 2022
Watch What You Pretrain For: Targeted, Transferable Adversarial Examples on Self-Supervised Speech Recognition models
R. Olivier
H. Abdullah
Bhiksha Raj
AAML
267
1
0
17 Sep 2022
Parameter-Efficient Conformers via Sharing Sparsely-Gated Experts for End-to-End Speech Recognition
Interspeech (Interspeech), 2022
Ye Bai
Jie Li
W. Han
Hao Ni
Kaituo Xu
Zhuo Zhang
Cheng Yi
Xiaorui Wang
MoE
140
3
0
17 Sep 2022
Analysis of Self-Attention Head Diversity for Conformer-based Automatic Speech Recognition
Interspeech (Interspeech), 2022
Kartik Audhkhasi
Yinghui Huang
Bhuvana Ramabhadran
Pedro J. Moreno
128
5
0
13 Sep 2022
Non-autoregressive Error Correction for CTC-based ASR with Phone-conditioned Masked LM
Interspeech (Interspeech), 2022
Hayato Futami
Hirofumi Inaguma
Sei Ueno
Masato Mimura
S. Sakai
Tatsuya Kawahara
KELM
248
13
0
08 Sep 2022
Distilling the Knowledge of BERT for CTC-based ASR
Hayato Futami
Hirofumi Inaguma
Masato Mimura
S. Sakai
Tatsuya Kawahara
189
11
0
05 Sep 2022
Vision-Language Adaptive Mutual Decoder for OOV-STR
Jinshui Hu
Chenyu Liu
Qiandong Yan
Xuyang Zhu
Jiajia Wu
Feng Yu
Bing Yin
VLM
273
1
0
02 Sep 2022
Bayesian Neural Network Language Modeling for Speech Recognition
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Boyang Xue
Shoukang Hu
Junhao Xu
Mengzhe Geng
Xunying Liu
Helen M. Meng
UQCV
BDL
264
23
0
28 Aug 2022
Interpretable Multimodal Emotion Recognition using Hybrid Fusion of Speech and Image Data
Puneet Kumar
Sarthak Malik
Balasubramanian Raman
CVBM
200
35
0
25 Aug 2022
Comparison and Analysis of New Curriculum Criteria for End-to-End ASR
Interspeech (Interspeech), 2022
Georgios Karakasidis
Tamás Grósz
M. Kurimo
133
3
0
10 Aug 2022
ASR Error Correction with Constrained Decoding on Operation Prediction
Interspeech (Interspeech), 2022
J. Yang
Rong-Zhi Li
Wei Peng
192
12
0
09 Aug 2022
Adversarial Attacks on ASR Systems: An Overview
International Conference on Data Science in Cyberspace (ICDSC), 2022
Xiao Zhang
Hao Tan
Xuan Huang
Denghui Zhang
Keke Tang
Zhaoquan Gu
AAML
132
3
0
03 Aug 2022
VQ-T: RNN Transducers using Vector-Quantized Prediction Network States
Interspeech (Interspeech), 2022
Jiatong Shi
G. Saon
David Haws
Shinji Watanabe
Brian Kingsbury
153
3
0
03 Aug 2022
Pronunciation-aware unique character encoding for RNN Transducer-based Mandarin speech recognition
Spoken Language Technology Workshop (SLT), 2022
Peng Shen
Xugang Lu
Hisashi Kawai
106
2
0
29 Jul 2022
Improving Mandarin Speech Recogntion with Block-augmented Transformer
Xiaoming Ren
Huifeng Zhu
Liuwei Wei
Minghui Wu
Jie Hao
230
12
0
24 Jul 2022
Reducing Geographic Disparities in Automatic Speech Recognition via Elastic Weight Consolidation
Interspeech (Interspeech), 2022
V. Trinh
Pegah Ghahremani
Brian King
J. Droppo
A. Stolcke
Roland Maas
MoMe
107
7
0
16 Jul 2022
PoLyScriber: Integrated Fine-tuning of Extractor and Lyrics Transcriber for Polyphonic Music
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Xiaoxue Gao
Chitralekha Gupta
Haizhou Li
273
9
0
15 Jul 2022
Branchformer: Parallel MLP-Attention Architectures to Capture Local and Global Context for Speech Recognition and Understanding
International Conference on Machine Learning (ICML), 2022
Yifan Peng
Siddharth Dalmia
Ian Lane
Shinji Watanabe
271
193
0
06 Jul 2022
DEFORMER: Coupling Deformed Localized Patterns with Global Context for Robust End-to-end Speech Recognition
Interspeech (Interspeech), 2022
Jiamin Xie
John H. L. Hansen
172
1
0
04 Jul 2022
Tree-constrained Pointer Generator with Graph Neural Network Encodings for Contextual Speech Recognition
Interspeech (Interspeech), 2022
Guangzhi Sun
Chuxu Zhang
P. Woodland
154
18
0
02 Jul 2022
FitHuBERT: Going Thinner and Deeper for Knowledge Distillation of Speech Self-Supervised Learning
Yeonghyeon Lee
Kangwook Jang
Jahyun Goo
Youngmoon Jung
Hoi-Rim Kim
248
39
0
01 Jul 2022
Language-specific Characteristic Assistance for Code-switching Speech Recognition
Interspeech (Interspeech), 2022
Tongtong Song
Qiang Xu
Meng Ge
Longbiao Wang
Hao Shi
Yongjie Lv
Yuqin Lin
Jianwu Dang
195
35
0
29 Jun 2022
Contextual Density Ratio for Language Model Biasing of Sequence to Sequence ASR Systems
Interspeech (Interspeech), 2021
Jesús Andrés-Ferrer
Dario Albesano
P. Zhan
Paul Vozila
106
6
0
29 Jun 2022
On Comparison of Encoders for Attention based End to End Speech Recognition in Standalone and Rescoring Mode
International Conference on Signal Processing and Communications (ICSPC), 2022
Raviraj Joshi
Subodh Kumar
111
2
0
26 Jun 2022
Towards Green ASR: Lossless 4-bit Quantization of a Hybrid TDNN System on the 300-hr Switchboard Corpus
Interspeech (Interspeech), 2022
Junhao Xu
Shoukang Hu
Xunying Liu
Helen M. Meng
MQ
214
5
0
23 Jun 2022
Two-pass Decoding and Cross-adaptation Based System Combination of End-to-end Conformer and Hybrid TDNN ASR Systems
Interspeech (Interspeech), 2022
Mingyu Cui
Jiajun Deng
Shoukang Hu
Xurong Xie
Tianzi Wang
Shujie Hu
Mengzhe Geng
Boyang Xue
Xunying Liu
Helen M. Meng
145
10
0
23 Jun 2022
Boosting Cross-Domain Speech Recognition with Self-Supervision
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Hanjing Zhu
Gaofeng Cheng
Yongfeng Zhang
Wenxin Hou
Pengyuan Zhang
Yonghong Yan
344
22
0
20 Jun 2022
Avoid Overfitting User Specific Information in Federated Keyword Spotting
Interspeech (Interspeech), 2022
Xin-Chun Li
Jin-Lin Tang
Shaoming Song
Bingshuai Li
Yinchuan Li
Yunfeng Shao
Le Gan
De-Chuan Zhan
FedML
AAML
143
9
0
17 Jun 2022
Paraformer: Fast and Accurate Parallel Transformer for Non-autoregressive End-to-End Speech Recognition
Interspeech (Interspeech), 2022
Zhifu Gao
Shiliang Zhang
Ian McLoughlin
Zhijie Yan
237
181
0
16 Jun 2022
Residual Language Model for End-to-end Speech Recognition
Interspeech (Interspeech), 2022
E. Tsunoo
Yosuke Kashiwagi
Chaitanya Narisetty
Shinji Watanabe
148
11
0
15 Jun 2022
LegoNN: Building Modular Encoder-Decoder Models
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Siddharth Dalmia
Dmytro Okhonko
M. Lewis
Sergey Edunov
Shinji Watanabe
Florian Metze
Luke Zettlemoyer
Abdel-rahman Mohamed
AuLLM
MoE
176
16
0
07 Jun 2022
Contextual Adapters for Personalized Speech Recognition in Neural Transducers
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Kanthashree Mysore Sathyendra
Thejaswi Muniyappa
Feng-Ju Chang
Jing Liu
Jinru Su
Grant P. Strimel
Athanasios Mouchtaris
Siegfried Kunzmann
188
87
0
26 May 2022
Transcormer: Transformer for Sentence Scoring with Sliding Language Modeling
Neural Information Processing Systems (NeurIPS), 2022
Kaitao Song
Yichong Leng
Xu Tan
Yicheng Zou
Tao Qin
Dongsheng Li
235
11
0
25 May 2022
Adaptive multilingual speech recognition with pretrained models
Interspeech (Interspeech), 2022
Ngoc-Quan Pham
A. Waibel
Jan Niehues
VLM
210
25
0
24 May 2022
Multi-Level Modeling Units for End-to-End Mandarin Speech Recognition
International Symposium on Chinese Spoken Language Processing (ISCSLP), 2022
Yuting Yang
Binbin Du
Yuke Li
329
2
0
24 May 2022
Deep Learning for Visual Speech Analysis: A Survey
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Changchong Sheng
Gangyao Kuang
L. Bai
Chen Hou
Yike Guo
Xin Xu
M. Pietikäinen
Tianpeng Liu
VLM
314
53
0
22 May 2022
Minimising Biasing Word Errors for Contextual ASR with the Tree-Constrained Pointer Generator
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Guangzhi Sun
Chuxu Zhang
P. Woodland
202
16
0
18 May 2022
Evaluating Membership Inference Through Adversarial Robustness
Computer/law journal (JITPL), 2022
Zhaoxi Zhang
L. Zhang
Xufei Zheng
Bilal Hussain Abbasi
Shengshan Hu
AAML
202
19
0
14 May 2022
Improved Consistency Training for Semi-Supervised Sequence-to-Sequence ASR via Speech Chain Reconstruction and Self-Transcribing
Interspeech (Interspeech), 2022
Heli Qi
Sashi Novitasari
S. Sakti
Satoshi Nakamura
AI4TS
252
2
0
14 May 2022
Personalized Adversarial Data Augmentation for Dysarthric and Elderly Speech Recognition
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Zengrui Jin
Mengzhe Geng
Jiajun Deng
Tianzi Wang
Shujie Hu
Guinan Li
Xunying Liu
226
38
0
13 May 2022
Wav2Seq: Pre-training Speech-to-Text Encoder-Decoder Models Using Pseudo Languages
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Felix Wu
Kwangyoun Kim
Shinji Watanabe
Kyu Jeong Han
Ryan T. McDonald
Kilian Q. Weinberger
Yoav Artzi
SyDa
216
46
0
02 May 2022
How does a spontaneously speaking conversational agent affect user behavior?
IEEE Access (IEEE Access), 2022
Takahisa Iizuka
H. Mori
43
4
0
02 May 2022
Bilingual End-to-End ASR with Byte-Level Subwords
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Liuhui Deng
Roger Hsiao
Arnab Ghoshal
149
6
0
01 May 2022
Attention Mechanism in Neural Networks: Where it Comes and Where it Goes
Derya Soydaner
3DV
279
290
0
27 Apr 2022
Supervised Attention in Sequence-to-Sequence Models for Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Gene-Ping Yang
Hao Tang
121
6
0
25 Apr 2022
Efficient Training of Neural Transducer for Speech Recognition
Interspeech (Interspeech), 2022
Wei Zhou
Wilfried Michel
Ralf Schlüter
Hermann Ney
AI4TS
188
28
0
22 Apr 2022
Cross-stitched Multi-modal Encoders
Karan Singla
Daniel Pressel
Ryan Price
Bhargav Srinivas Chinnari
Yeon-Jun Kim
S. Bangalore
161
0
0
20 Apr 2022