Listen, Attend and Spell

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2015
5 August 2015
William Chan
Navdeep Jaitly
Quoc V. Le
Oriol Vinyals
    RALM

Papers citing "Listen, Attend and Spell"

50 / 1,064 papers shown
Neural Architecture Search For LF-MMI Trained Time Delay Neural Networks
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Shou-Yong Hu
Xurong Xie
Mingyu Cui
Jiajun Deng
Shansong Liu
Jianwei Yu
Mengzhe Geng
Xunying Liu
Helen Meng
247
28
0
08 Jan 2022
Two-Pass End-to-End ASR Model Compression
Automatic Speech Recognition & Understanding (ASRU), 2021
Nauman Dawalatabad
Tushar Vatsal
Ashutosh Gupta
Sungsoo Kim
Shatrughan Singh
Dhananjaya N. Gowda
Chanwoo Kim
86
6
0
08 Jan 2022
Sign Language Video Retrieval with Free-Form Textual Queries
Computer Vision and Pattern Recognition (CVPR), 2022
A. Duarte
Samuel Albanie
Xavier Giró-i-Nieto
Gül Varol
SLR
222
36
0
07 Jan 2022
Improving Mandarin End-to-End Speech Recognition with Word N-gram Language Model
IEEE Signal Processing Letters (SPL), 2022
Jinchuan Tian
Jianwei Yu
Chao Weng
Yuexian Zou
Dong Yu
180
15
0
06 Jan 2022
Discrete and continuous representations and processing in deep learning: Looking forward
AI Open (AO), 2022
Ruben Cartuyvels
Graham Spinks
Marie-Francine Moens
OCL
300
28
0
04 Jan 2022
Speech-to-SQL: Towards Speech-driven SQL Query Generation From Natural Language Question
The VLDB journal (VLDBJ), 2022
Wailing Ng
Raymond Chi-Wing Wong
Xuefang Zhao
Chen Zhang
184
19
0
04 Jan 2022
Voice Quality and Pitch Features in Transformer-Based Speech Recognition
Proceedings of the International Conference on Speech Prosody (ICSP), 2021
Guillermo Cámbara
Jordi Luque
Mireia Farrús
138
0
0
21 Dec 2021
Saliency Grafting: Innocuous Attribution-Guided Mixup with Calibrated Label Mixing
Joonhyung Park
J. Yang
Jinwoo Shin
Sung Ju Hwang
Eunho Yang
176
26
0
16 Dec 2021
Prompt Tuning GPT-2 language model for parameter-efficient domain adaptation of ASR systems
Saket Dingliwal
Ashish Shenoy
S. Bodapati
Ankur Gandhe
R. Gadde
Katrin Kirchhoff
VLM
276
4
0
16 Dec 2021
Improving Hybrid CTC/Attention End-to-end Speech Recognition with Pretrained Acoustic and Language Model
Keqi Deng
Songjun Cao
Yike Zhang
Long Ma
VLM
90
32
0
14 Dec 2021
PM-MMUT: Boosted Phone-Mask Data Augmentation using Multi-Modeling Unit Training for Phonetic-Reduction-Robust E2E Speech Recognition
Guodong Ma
Pengfei Hu
Nurmemet Yolwas
Shen Huang
Hao-Ming Huang
276
5
0
13 Dec 2021
Consistent Training and Decoding For End-to-end Speech Recognition Using Lattice-free MMI
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Jinchuan Tian
Jianwei Yu
Chao Weng
Shi-Xiong Zhang
Jane Polak Scowcroft
Dong Yu
Yuexian Zou
AuLLM
187
15
0
05 Dec 2021
Deliberation of Streaming RNN-Transducer by Non-autoregressive Decoding
Weiran Wang
Ke Hu
Tara N. Sainath
159
25
0
01 Dec 2021
Mixed Precision Low-bit Quantization of Neural Network Language Models for Speech Recognition
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2021
Junhao Xu
Jianwei Yu
Shoukang Hu
Xunying Liu
Helen Meng
MQ
254
18
0
29 Nov 2021
Lattention: Lattice-attention in ASR rescoring
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Prabhat Pandey
Sergio Duarte Torres
Ali Orkan Bayer
Ankur Gandhe
Volker Leutnant
145
9
0
19 Nov 2021
A comparison of streaming models and data augmentation methods for robust speech recognition
Automatic Speech Recognition & Understanding (ASRU), 2021
Jiyeon Kim
Mehul Kumar
Dhananjaya N. Gowda
Abhinav Garg
Chanwoo Kim
123
6
0
19 Nov 2021
Integrated Semantic and Phonetic Post-correction for Chinese Speech Recognition
Yi-Chang Chen
Chun-Yen Cheng
Chien-An Chen
Ming-Chieh Sung
Yi-Ren Yeh
98
9
0
16 Nov 2021
Deciphering Speech: a Zero-Resource Approach to Cross-Lingual Transfer in ASR
Interspeech (Interspeech), 2021
Ondřej Klejch
E. Wallington
P. Bell
170
15
0
12 Nov 2021
Enhancing Backdoor Attacks with Multi-Level MMD Regularization
IEEE Transactions on Dependable and Secure Computing (IEEE TDSC), 2021
Pengfei Xia
Hongjing Niu
Wandi Qiao
Bin Li
AAML
226
35
0
09 Nov 2021
Conformer-based Hybrid ASR System for Switchboard Dataset
Mohammad Zeineldeen
Jingjing Xu
Christoph Lüscher
Wilfried Michel
Alexander Gerstenberger
Ralf Schlüter
Hermann Ney
235
26
0
05 Nov 2021
Context-Aware Transformer Transducer for Speech Recognition
Automatic Speech Recognition & Understanding (ASRU), 2021
Feng-Ju Chang
Jing Liu
Martin H. Radfar
Athanasios Mouchtaris
M. Omologo
Ariya Rastrow
Siegfried Kunzmann
188
96
0
05 Nov 2021
Recent Advances in End-to-End Automatic Speech Recognition
APSIPA Transactions on Signal and Information Processing (TASIP), 2021
Jinyu Li
VLM
433
427
0
02 Nov 2021
With a Little Help from my Temporal Context: Multimodal Egocentric Action Recognition
British Machine Vision Conference (BMVC), 2021
Evangelos Kazakos
Jaesung Huh
Arsha Nagrani
Andrew Zisserman
Dima Damen
EgoV
297
54
0
01 Nov 2021
Revealing and Protecting Labels in Distributed Training
Neural Information Processing Systems (NeurIPS), 2021
Trung D. Q. Dang
Om Thakkar
Swaroop Indra Ramaswamy
Rajiv Mathews
Peter Chin
Françoise Beaufays
104
29
0
31 Oct 2021
Pseudo-Labeling for Massively Multilingual Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Loren Lugosch
Tatiana Likhomanenko
Gabriel Synnaeve
R. Collobert
VLM
299
34
0
30 Oct 2021
Cross-attention conformer for context modeling in speech enhancement for ASR
Automatic Speech Recognition & Understanding (ASRU), 2021
A. Narayanan
Chung-Cheng Chiu
Tom O'Malley
Quan Wang
Yanzhang He
186
16
0
30 Oct 2021
An Investigation of Enhancing CTC Model for Triggered Attention-based Streaming ASR
Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2021
Huaibo Zhao
Yosuke Higuchi
Tetsuji Ogawa
Tetsunori Kobayashi
96
4
0
20 Oct 2021
Automatic Learning of Subword Dependent Model Scales
Felix Meyer
Wilfried Michel
Mohammad Zeineldeen
Ralf Schlüter
Hermann Ney
55
0
0
18 Oct 2021
Sub-word Level Lip Reading With Visual Attention
Prajwal K R
Triantafyllos Afouras
Andrew Zisserman
225
111
0
14 Oct 2021
On Language Model Integration for RNN Transducer based Speech Recognition
Wei Zhou
Zuoyun Zheng
Ralf Schlüter
Hermann Ney
268
27
0
13 Oct 2021
Reason induced visual attention for explainable autonomous driving
Sikai Chen
Jiqian Dong
Runjia Du
Yujie Li
Samuel Labi
150
2
0
11 Oct 2021
A Comparative Study on Non-Autoregressive Modelings for Speech-to-Text Generation
Automatic Speech Recognition & Understanding (ASRU), 2021
Yosuke Higuchi
Nanxin Chen
Yuya Fujita
Hirofumi Inaguma
Tatsuya Komatsu
Jaesong Lee
Jumon Nozaki
Tianzi Wang
Shinji Watanabe
135
49
0
11 Oct 2021
K-Wav2vec 2.0: Automatic Speech Recognition based on Joint Decoding of Graphemes and Syllables
Interspeech (Interspeech), 2021
Jounghee Kim
Pilsung Kang
VLM
120
7
0
11 Oct 2021
Advancing Momentum Pseudo-Labeling with Conformer and Initialization Strategy
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Yosuke Higuchi
Niko Moritz
Jonathan Le Roux
Takaaki Hori
184
13
0
11 Oct 2021
Have best of both worlds: two-pass hybrid and E2E cascading framework for speech recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Guoli Ye
V. Mazalov
Jinyu Li
Jiawei Liu
171
9
0
10 Oct 2021
SCaLa: Supervised Contrastive Learning for End-to-End Speech Recognition
Interspeech (Interspeech), 2021
Li Fu
Xiaoxiao Li
Runyu Wang
Lu Fan
Zhengchen Zhang
Meng Chen
Youzheng Wu
Xiaodong He
SSL
156
3
0
08 Oct 2021
Hierarchical Conditional End-to-End ASR with CTC and Multi-Granular Subword Units
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Yosuke Higuchi
Keita Karube
Tetsuji Ogawa
Tetsunori Kobayashi
170
28
0
08 Oct 2021
Improving Pseudo-label Training For End-to-end Speech Recognition Using Gradient Mask
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Shaoshi Ling
Chen Shen
Meng Cai
Zejun Ma
VLM
SSL
133
10
0
08 Oct 2021
Explaining the Attention Mechanism of End-to-End Speech Recognition Using Decision Trees
Yuanchao Wang
Wenjing Du
Chenghao Cai
Yanyan Xu
145
1
0
08 Oct 2021
WenetSpeech: A 10000+ Hours Multi-domain Mandarin Corpus for Speech Recognition
Binbin Zhang
Hang Lv
Pengcheng Guo
Qijie Shao
Chao Yang
...
Hui Bu
Xiaoyu Chen
Chenchen Zeng
Di Wu
Zhendong Peng
407
289
0
07 Oct 2021
BERT Attends the Conversation: Improving Low-Resource Conversational ASR
Pablo Ortiz
Simen Burud
131
5
0
05 Oct 2021
ASR Rescoring and Confidence Estimation with ELECTRA
Hayato Futami
Hirofumi Inaguma
Masato Mimura
S. Sakai
Tatsuya Kawahara
KELM
211
22
0
05 Oct 2021
Multi-axis Attentive Prediction for Sparse EventData: An Application to Crime Prediction
Yi Sui
Ga Wu
Scott Sanner
123
2
0
05 Oct 2021
Fast Contextual Adaptation with Neural Associative Memory for On-Device Personalized Speech Recognition
Tsendsuren Munkhdalai
K. Sim
Angad Chandorkar
Fan Gao
Mason Chua
Trevor Strohman
F. Beaufays
224
39
0
05 Oct 2021
Towards efficient end-to-end speech recognition with biologically-inspired neural networks
Thomas Bohnstingl
Ayush Garg
Stanisław Woźniak
G. Saon
E. Eleftheriou
A. Pantazi
178
5
0
04 Oct 2021
Speech Technology for Everyone: Automatic Speech Recognition for Non-Native English with Transfer Learning
Toshiko Shibano
Xinyi Zhang
Miao Li
Haejin Cho
Peter Sullivan
Muhammad Abdul-Mageed
VLM
229
18
0
01 Oct 2021
Multimodal Emotion Recognition with High-level Speech and Text Features
M. R. Makiuchi
Kuniaki Uto
Koichi Shinoda
226
84
0
29 Sep 2021
Word-level confidence estimation for RNN transducers
Mingqiu Wang
H. Soltau
Laurent El Shafey
Izhak Shafran
UQCV
163
7
0
28 Sep 2021
Private Language Model Adaptation for Speech Recognition
Zhe Liu
Ke Li
Shreyan Bakshi
Fuchun Peng
259
6
0
28 Sep 2021
Factorized Neural Transducer for Efficient Language Model Adaptation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Xie Chen
Zhong Meng
S. Parthasarathy
Jinyu Li
497
44
0
27 Sep 2021