Listen, Attend and Spell

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2015
5 August 2015
William Chan
Navdeep Jaitly
Quoc V. Le
Oriol Vinyals
    RALM

Papers citing "Listen, Attend and Spell"

50 / 1,064 papers shown
An Investigation of Monotonic Transducers for Large-Scale Automatic Speech Recognition
Spoken Language Technology Workshop (SLT), 2022
Niko Moritz
Frank Seide
Duc Le
Jay Mahadeokar
Christian Fuegen
371
10
0
19 Apr 2022
Self-critical Sequence Training for Automatic Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Chen Chen
Yuchen Hu
Nana Hou
Xiaofeng Qi
Heqing Zou
Chng Eng Siong
165
17
0
13 Apr 2022
Tokenwise Contrastive Pretraining for Finer Speech-to-BERT Alignment in End-to-End Speech-to-Intent Systems
Interspeech (Interspeech), 2022
Vishal Sunder
Eric Fosler-Lussier
Samuel Thomas
H. Kuo
Brian Kingsbury
238
8
0
11 Apr 2022
Adding Connectionist Temporal Summarization into Conformer to Improve Its Decoder Efficiency For Speech Recognition
N. J. Wang
Zongfeng Quan
Shaojun Wang
Jing Xiao
123
1
0
08 Apr 2022
A Complementary Joint Training Approach Using Unpaired Speech and Text for Low-Resource Automatic Speech Recognition
Ye Du
Jie Zhang
Qiu-shi Zhu
Lirong Dai
Ming Wu
Xin Fang
Zhouwang Yang
146
2
0
05 Apr 2022
Class-Incremental Learning by Knowledge Distillation with Adaptive Feature Consolidation
Computer Vision and Pattern Recognition (CVPR), 2022
Minsoo Kang
Jaeyoo Park
Bohyung Han
CLL
249
230
0
02 Apr 2022
Leveraging Phone Mask Training for Phonetic-Reduction-Robust E2E Uyghur Speech Recognition
Interspeech (Interspeech), 2021
Guodong Ma
Pengfei Hu
Jian Kang
Shen Huang
Hao-Ming Huang
145
13
0
02 Apr 2022
Multi-task RNN-T with Semantic Decoder for Streamable Spoken Language Understanding
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Xuandi Fu
Feng-Ju Chang
Martin H. Radfar
Kailin Wei
Jing Liu
Grant P. Strimel
Kanthashree Mysore Sathyendra
134
4
0
01 Apr 2022
Memory-Efficient Training of RNN-Transducer with Sampled Softmax
Interspeech (Interspeech), 2022
Jaesong Lee
Lukas Lee
Shinji Watanabe
286
8
0
31 Mar 2022
Open Source MagicData-RAMC: A Rich Annotated Mandarin Conversational(RAMC) Speech Dataset
Interspeech (Interspeech), 2022
Zehui Yang
Yifan Chen
Lei Luo
Runyan Yang
Lingxuan Ye
...
Yaohui Jin
Qingqing Zhang
Pengyuan Zhang
Lei Xie
Yonghong Yan
147
71
0
31 Mar 2022
NeuFA: Neural Network Based End-to-End Forced Alignment with Bidirectional Attention Mechanism
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Jingbei Li
Yi Meng
Zhiyong Wu
Helen Meng
Qiao Tian
Yuping Wang
Yuxuan Wang
93
28
0
31 Mar 2022
CUSIDE: Chunking, Simulating Future Context and Decoding for Streaming ASR
Interspeech (Interspeech), 2022
Keyu An
Huahuan Zheng
Zhijian Ou
Hongyu Xiang
Ke Ding
Guanglu Wan
AI4TS
172
21
0
31 Mar 2022
Enhancing Zero-Shot Many to Many Voice Conversion with Self-Attention VAE
International Conference on Artificial Intelligence for Industries (ICAII), 2022
Ziang Long
Yunling Zheng
Meng Yu
Jack Xin
DRL
154
6
0
30 Mar 2022
Recent improvements of ASR models in the face of adversarial attacks
Interspeech (Interspeech), 2022
R. Olivier
Bhiksha Raj
AAML
260
18
0
29 Mar 2022
Streaming parallel transducer beam search with fast-slow cascaded encoders
Interspeech (Interspeech), 2022
Jay Mahadeokar
Yangyang Shi
Ke Li
Duc Le
Jiedan Zhu
Vikas Chandra
Ozlem Kalinli
M. Seltzer
204
17
0
29 Mar 2022
Integrating Lattice-Free MMI into End-to-End Speech Recognition
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Jinchuan Tian
Jianwei Yu
Chao Weng
Yuexian Zou
Dong Yu
301
10
0
29 Mar 2022
WeNet 2.0: More Productive End-to-End Speech Recognition Toolkit
Interspeech (Interspeech), 2022
Binbin Zhang
Di Wu
Zhendong Peng
Xingcheng Song
Zhuoyuan Yao
Hang Lv
Linfu Xie
Chao Yang
Fuping Pan
Jianwei Niu
VLM
274
129
0
29 Mar 2022
Investigating Self-supervised Pretraining Frameworks for Pathological Speech Recognition
Interspeech (Interspeech), 2022
Lester Phillip Violeta
Wen-Chin Huang
Tomoki Toda
268
44
0
29 Mar 2022
Noise-robust Speech Recognition with 10 Minutes Unparalleled In-domain Data
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Chen Chen
Nana Hou
Yuchen Hu
Shashank Shirol
Chng Eng Siong
NoLa
201
49
0
29 Mar 2022
Shifted Chunk Encoder for Transformer Based Streaming End-to-End ASR
International Conference on Neural Information Processing (ICONIP), 2022
Fangyuan Wang
Bo Xu
169
5
0
29 Mar 2022
Finnish Parliament ASR corpus - Analysis, benchmarks and statistics
Language Resources and Evaluation (LRE), 2022
A. Virkkunen
Aku Rouhe
Nhan Phan
M. Kurimo
197
6
0
28 Mar 2022
Dual-Path Style Learning for End-to-End Noise-Robust Speech Recognition
Interspeech (Interspeech), 2022
Yuchen Hu
Nana Hou
Chen Chen
Chng Eng Siong
183
18
0
28 Mar 2022
Joint Transformer/RNN Architecture for Gesture Typing in Indic Languages
International Conference on Computational Linguistics (COLING), 2020
Emil Biju
Anirudh Sriram
Mitesh M. Khapra
Pratyush Kumar
108
4
0
26 Mar 2022
Lahjoita puhetta -- a large-scale corpus of spoken Finnish with some benchmarks
Anssi Moisio
Dejan Porjazovski
Aku Rouhe
Yaroslav Getman
A. Virkkunen
Tamás Grósz
Krister Lindén
M. Kurimo
193
24
0
24 Mar 2022
Modality Competition: What Makes Joint Training of Multi-modal Network Fail in Deep Learning? (Provably)
International Conference on Machine Learning (ICML), 2022
Yu Huang
Junyang Lin
Chang Zhou
Hongxia Yang
Longbo Huang
181
145
0
23 Mar 2022
Transformer-based Streaming ASR with Cumulative Attention
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Mohan Li
Shucong Zhang
Catalin Zorila
R. Doddipatla
160
11
0
11 Mar 2022
aaeCAPTCHA: The Design and Implementation of Audio Adversarial CAPTCHA
European Symposium on Security and Privacy (Euro S&P), 2022
Md. Imran Hossen
X. Hei
141
9
0
05 Mar 2022
Towards Contextual Spelling Correction for Customization of End-to-end Speech Recognition Systems
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Xiaoqiang Wang
Yanqing Liu
Jinyu Li
Veljko Miljanic
Sheng Zhao
H. Khalil
KELM
232
23
0
02 Mar 2022
A Brief Overview of Unsupervised Neural Speech Representation Learning
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
Lars Maaløe
Christian Igel
BDL AI4TS SSL
243
13
0
01 Mar 2022
Adversarial Attacks on Speech Recognition Systems for Mission-Critical Applications: A Survey
Ngoc Dung Huynh
Mohamed Reda Bouadjenek
Imran Razzak
Kevin Lee
Chetan Arora
Ali Hassani
A. Zaslavsky
AAML
182
8
0
22 Feb 2022
Speaker Adaptation Using Spectro-Temporal Deep Features for Dysarthric and Elderly Speech Recognition
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Mengzhe Geng
Xurong Xie
Zi Ye
Tianzi Wang
Guinan Li
Shujie Hu
Xunying Liu
Helen Meng
227
48
0
21 Feb 2022
Learning Representations Robust to Group Shifts and Adversarial Examples
Ming-Chang Chiu
Xuezhe Ma
OOD
126
0
0
18 Feb 2022
End-to-end contextual asr based on posterior distribution adaptation for hybrid ctc/attention system
Zheng Zhang
Pan Zhou
157
7
0
18 Feb 2022
Knowledge Transfer from Large-scale Pretrained Language Models to End-to-end Speech Recognizers
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Yotaro Kubo
Shigeki Karita
M. Bacchiani
141
30
0
16 Feb 2022
Conversational Speech Recognition By Learning Conversation-level Characteristics
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Kun Wei
Yike Zhang
Sining Sun
Lei Xie
Long Ma
171
10
0
16 Feb 2022
USTED: Improving ASR with a Unified Speech and Text Encoder-Decoder
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Bolaji Yusuf
Ankur Gandhe
Alex Sokolov
202
11
0
12 Feb 2022
Improving Automatic Speech Recognition for Non-Native English with Transfer Learning and Language Model Decoding
Peter Sullivan
Toshiko Shibano
Muhammad Abdul-Mageed
150
12
0
10 Feb 2022
ASRPU: A Programmable Accelerator for Low-Power Automatic Speech Recognition
Social Science Research Network (SSRN), 2022
D. Pinto
J. Arnau
Antonio González
65
0
0
10 Feb 2022
Semantic-aware Speech to Text Transmission with Redundancy Removal
Tian Han
Qianqian Yang
Zhiguo Shi
Shibo He
Zhaoyang Zhang
178
21
0
07 Feb 2022
Joint Speech Recognition and Audio Captioning
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Chaitanya Narisetty
E. Tsunoo
Xuankai Chang
Yosuke Kashiwagi
Michael Hentschel
Shinji Watanabe
145
10
0
03 Feb 2022
RescoreBERT: Discriminative Speech Recognition Rescoring with BERT
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Liyan Xu
Yile Gu
J. Kolehmainen
Haidar Khan
Ankur Gandhe
Ariya Rastrow
A. Stolcke
I. Bulyko
416
59
0
02 Feb 2022
BEA-Base: A Benchmark for ASR of Spontaneous Hungarian
International Conference on Language Resources and Evaluation (LREC), 2022
P. Mihajlik
A. Balog
T. E. Gráczi
A. Kohári
Balázs Tarján
K. Mády
154
9
0
01 Feb 2022
Transformer-based Models of Text Normalization for Speech Applications
Jae Hun Ro
Felix Stahlberg
Ke Wu
Shankar Kumar
173
8
0
01 Feb 2022
Improving End-to-End Contextual Speech Recognition with Fine-Grained Contextual Knowledge Selection
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Minglun Han
Linhao Dong
Zhenlin Liang
Meng Cai
Shiyu Zhou
Zejun Ma
Bo Xu
165
56
0
30 Jan 2022
Reducing language context confusion for end-to-end code-switching automatic speech recognition
Interspeech (Interspeech), 2022
Shuai Zhang
Jiangyan Yi
Zhengkun Tian
Jianhua Tao
Y. Yeung
Liqun Deng
169
16
0
28 Jan 2022
On the Effectiveness of Pinyin-Character Dual-Decoding for End-to-End Mandarin Chinese ASR
Zhao Yang
Dianwen Ng
Xiao Fu
Liping Han
Wei Xi
Ruimeng Wang
Rui Jiang
Jizhong Zhao
223
3
0
26 Jan 2022
Improving the fusion of acoustic and text representations in RNN-T
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Chao Zhang
Yue Liu
Zhiyun Lu
Tara N. Sainath
Shuo-yiin Chang
AI4CE
199
13
0
25 Jan 2022
Run-and-back stitch search: novel block synchronous decoding for streaming encoder-decoder ASR
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
E. Tsunoo
Chaitanya Narisetty
Michael Hentschel
Yosuke Kashiwagi
Shinji Watanabe
153
3
0
25 Jan 2022
Recent Progress in the CUHK Dysarthric Speech Recognition System
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Shansong Liu
Mengzhe Geng
Shoukang Hu
Xurong Xie
Mingyu Cui
Jianwei Yu
Xunying Liu
Helen Meng
161
83
0
15 Jan 2022
Spectro-Temporal Deep Features for Disordered Speech Assessment and Recognition
Interspeech (Interspeech), 2021
Mengzhe Geng
Shansong Liu
Jianwei Yu
Xurong Xie
Shoukang Hu
Zi Ye
Zengrui Jin
Xunying Liu
Helen Meng
120
22
0
14 Jan 2022