1508.01211
Listen, Attend and Spell
5 August 2015
William Chan
Navdeep Jaitly
Quoc V. Le
Oriol Vinyals
RALM
Papers citing "Listen, Attend and Spell"
50 / 510 papers shown
Title
Exploring Transformers in Natural Language Generation: GPT, BERT, and XLNet
M. O. Topal
Anil Bas
Imke van Heerden
LLMAG
AI4CE
24
88
0
16 Feb 2021
Leveraging Acoustic and Linguistic Embeddings from Pretrained speech and language Models for Intent Classification
Bidisha Sharma
Maulik C. Madhavi
Haizhou Li
16
19
0
15 Feb 2021
Speech Recognition by Simply Fine-tuning BERT
Wen-Chin Huang
Chia-Hua Wu
Shang-Bao Luo
Kuan-Yu Chen
Hsin-Min Wang
T. Toda
70
28
0
30 Jan 2021
UniSpeech: Unified Speech Representation Learning with Labeled and Unlabeled Data
Chengyi Wang
Yu-Huan Wu
Yao Qian
K. Kumatani
Shujie Liu
Furu Wei
Michael Zeng
Xuedong Huang
OT
SSL
30
112
0
19 Jan 2021
A Survey on Deep Reinforcement Learning for Audio-Based Applications
S. Latif
Heriberto Cuayáhuitl
Farrukh Pervez
Fahad Shamshad
Hafiz Shehbaz Ali
Erik Cambria
OffRL
44
73
0
01 Jan 2021
ConvMath: A Convolutional Sequence Network for Mathematical Expression Recognition
Zuoyu Yan
Xiaode Zhang
Liangcai Gao
Ke Yuan
Zhi Tang
19
17
0
23 Dec 2020
NeurST: Neural Speech Translation Toolkit
Chengqi Zhao
Mingxuan Wang
Qianqian Dong
Rong Ye
Lei Li
22
32
0
18 Dec 2020
Less Is More: Improved RNN-T Decoding Using Limited Label Context and Path Merging
Rohit Prabhavalkar
Yanzhang He
David Rybach
S. Campbell
A. Narayanan
Trevor Strohman
Tara N. Sainath
41
35
0
12 Dec 2020
Streaming end-to-end multi-talker speech recognition
Liang Lu
Naoyuki Kanda
Jinyu Li
Y. Gong
11
41
0
26 Nov 2020
Using Synthetic Audio to Improve The Recognition of Out-Of-Vocabulary Words in End-To-End ASR Systems
Xianrui Zheng
Yulan Liu
Deniz Gunceler
D. Willett
17
78
0
23 Nov 2020
Deep Shallow Fusion for RNN-T Personalization
Duc Le
Gil Keren
Julian Chan
Jay Mahadeokar
Christian Fuegen
M. Seltzer
21
77
0
16 Nov 2020
Towards Semi-Supervised Semantics Understanding from Speech
Cheng-I Jeff Lai
Jin Cao
S. Bodapati
Shang-Wen Li
SSL
14
7
0
11 Nov 2020
A low latency ASR-free end to end spoken language understanding system
Mohamed Mhiri
Samuel Myer
Vikrant Singh Tomar
22
8
0
10 Nov 2020
Dual Application of Speech Enhancement for Automatic Speech Recognition
Ashutosh Pandey
Chunxi Liu
Yun Wang
Yatharth Saraf
33
37
0
07 Nov 2020
Improving RNN Transducer Based ASR with Auxiliary Tasks
Chunxi Liu
Frank Zhang
Duc Le
Suyoun Kim
Yatharth Saraf
Geoffrey Zweig
23
49
0
05 Nov 2020
Internal Language Model Estimation for Domain-Adaptive End-to-End Speech Recognition
Zhong Meng
S. Parthasarathy
Eric Sun
Yashesh Gaur
Naoyuki Kanda
Liang Lu
Xie Chen
Rui Zhao
Jinyu Li
Y. Gong
AuLLM
19
107
0
03 Nov 2020
HarperValleyBank: A Domain-Specific Spoken Dialog Corpus
Mike Wu
J. Nafziger
A. Scodary
Andrew L. Maas
24
17
0
26 Oct 2020
Semi-Supervised Spoken Language Understanding via Self-Supervised Speech and Language Model Pretraining
Cheng-I Jeff Lai
Yung-Sung Chuang
Hung-yi Lee
Shang-Wen Li
James R. Glass
VLM
SSL
22
58
0
26 Oct 2020
Transformer-based End-to-End Speech Recognition with Local Dense Synthesizer Attention
Menglong Xu
Shengqiang Li
Xiao-Lei Zhang
24
31
0
23 Oct 2020
Confidence Estimation for Attention-based Sequence-to-sequence Models for Speech Recognition
Qiujia Li
David Qiu
Yu Zhang
Bo-wen Li
Yanzhang He
P. Woodland
Liangliang Cao
Trevor Strohman
4
46
0
22 Oct 2020
Developing Real-time Streaming Transformer Transducer for Speech Recognition on Large-scale Dataset
Xie Chen
Yu-Huan Wu
Zhenghao Wang
Shujie Liu
Jinyu Li
17
169
0
22 Oct 2020
A General Multi-Task Learning Framework to Leverage Text Data for Speech to Text Tasks
Yun Tang
J. Pino
Changhan Wang
Xutai Ma
Dmitriy Genzel
20
73
0
21 Oct 2020
Why Layer-Wise Learning is Hard to Scale-up and a Possible Solution via Accelerated Downsampling
Wenchi Ma
Miao Yu
Kaidong Li
Guanghui Wang
14
5
0
15 Oct 2020
Representation Learning for Sequence Data with Deep Autoencoding Predictive Components
Junwen Bai
Weiran Wang
Yingbo Zhou
Caiming Xiong
SSL
AI4TS
27
12
0
07 Oct 2020
Fine-Grained Grounding for Multimodal Speech Recognition
Tejas Srinivasan
Ramon Sanabria
Florian Metze
Desmond Elliott
23
11
0
05 Oct 2020
Explaining Deep Neural Networks
Oana-Maria Camburu
XAI
FAtt
25
26
0
04 Oct 2020
On Target Segmentation for Direct Speech Translation
Mattia Antonino Di Gangi
Marco Gaido
Matteo Negri
Marco Turchi
34
14
0
10 Sep 2020
Seeing wake words: Audio-visual Keyword Spotting
Liliane Momeni
Triantafyllos Afouras
Themos Stafylakis
Samuel Albanie
Andrew Zisserman
44
43
0
02 Sep 2020
Speech To Semantics: Improve ASR and NLU Jointly via All-Neural Interfaces
Milind Rao
A. Raju
Pranav Dheram
Bach Bui
Ariya Rastrow
13
43
0
14 Aug 2020
End-to-End Neural Transformer Based Spoken Language Understanding
Martin H. Radfar
Athanasios Mouchtaris
Siegfried Kunzmann
39
61
0
12 Aug 2020
Transformer with Bidirectional Decoder for Speech Recognition
Xi Chen
Songyang Zhang
Dandan Song
P. Ouyang
Shouyi Yin
16
13
0
11 Aug 2020
Distilling the Knowledge of BERT for Sequence-to-Sequence ASR
Hayato Futami
H. Inaguma
Sei Ueno
Masato Mimura
S. Sakai
Tatsuya Kawahara
19
50
0
09 Aug 2020
LRSpeech: Extremely Low-Resource Speech Synthesis and Recognition
Jin Xu
Xu Tan
Yi Ren
Tao Qin
Jian Li
Sheng Zhao
Tie-Yan Liu
VLM
16
90
0
09 Aug 2020
An Overview of Voice Conversion and its Challenges: From Statistical Modeling to Deep Learning
Berrak Sisman
Junichi Yamagishi
Simon King
Haizhou Li
BDL
27
316
0
09 Aug 2020
Hybrid Transformer/CTC Networks for Hardware Efficient Voice Triggering
Saurabh N. Adya
Vineet Garg
Siddharth Sigtia
P. Simha
C. Dhir
15
18
0
05 Aug 2020
Modular End-to-end Automatic Speech Recognition Framework for Acoustic-to-word Model
Qi Liu
Zhehuai Chen
Hao Li
Mingkun Huang
Yizhou Lu
Kai Yu
13
6
0
31 Jul 2020
Privacy-preserving Voice Analysis via Disentangled Representations
Ranya Aloufi
Hamed Haddadi
David E. Boyle
DRL
13
58
0
29 Jul 2020
Efficient minimum word error rate training of RNN-Transducer for end-to-end speech recognition
Jinxi Guo
Gautam Tiwari
J. Droppo
Maarten Van Segbroeck
Che-Wei Huang
A. Stolcke
Roland Maas
11
55
0
27 Jul 2020
CoVoST 2 and Massively Multilingual Speech-to-Text Translation
Changhan Wang
Anne Wu
J. Pino
SLR
19
71
0
20 Jul 2020
GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding
Dmitry Lepikhin
HyoukJoong Lee
Yuanzhong Xu
Dehao Chen
Orhan Firat
Yanping Huang
M. Krikun
Noam M. Shazeer
Z. Chen
MoE
20
1,106
0
30 Jun 2020
Boosting Active Learning for Speech Recognition with Noisy Pseudo-labeled Samples
Jihwan Bang
Heesu Kim
Y. Yoo
Jung-Woo Ha
9
2
0
19 Jun 2020
Exploration of End-to-End ASR for OpenSTT -- Russian Open Speech-to-Text Dataset
A. Andrusenko
A. Laptev
Ivan Medennikov
VLM
16
12
0
15 Jun 2020
Simplified Self-Attention for Transformer-based End-to-End Speech Recognition
Haoneng Luo
Shiliang Zhang
Ming Lei
Lei Xie
27
33
0
21 May 2020
A Comparison of Label-Synchronous and Frame-Synchronous End-to-End Models for Speech Recognition
Linhao Dong
Cheng Yi
Jianzong Wang
Shiyu Zhou
Shuang Xu
X. Jia
Bo Xu
36
17
0
20 May 2020
Mask CTC: Non-Autoregressive End-to-End ASR with CTC and Mask Predict
Yosuke Higuchi
Shinji Watanabe
Nanxin Chen
Tetsuji Ogawa
Tetsunori Kobayashi
17
136
0
18 May 2020
Spike-Triggered Non-Autoregressive Transformer for End-to-End Speech Recognition
Zhengkun Tian
Jiangyan Yi
J. Tao
Ye Bai
Shuai Zhang
Zhengqi Wen
6
54
0
16 May 2020
Large scale weakly and semi-supervised learning for low-resource video ASR
Kritika Singh
Vimal Manohar
Alex Xiao
Sergey Edunov
Ross B. Girshick
Vitaliy Liptchinsky
Christian Fuegen
Yatharth Saraf
Geoffrey Zweig
Abdel-rahman Mohamed
28
9
0
16 May 2020
FaceFilter: Audio-visual speech separation using still images
Soo-Whan Chung
Soyeon Choe
Joon Son Chung
Hong-Goo Kang
CVBM
21
66
0
14 May 2020
Discriminative Multi-modality Speech Recognition
Bo Xu
Cheng Lu
Yandong Guo
Jacob Wang
18
98
0
12 May 2020
Incremental Learning for End-to-End Automatic Speech Recognition
Li Fu
Xiaoxiao Li
Libo Zi
Zhengchen Zhang
Youzheng Wu
Xiaodong He
Bowen Zhou
CLL
32
23
0
11 May 2020