ResearchTrend.AI
Listen, Attend and Spell

5 August 2015
William Chan
Navdeep Jaitly
Quoc V. Le
Oriol Vinyals
    RALM

Papers citing "Listen, Attend and Spell"

50 / 510 papers shown
AutoSpeech: Neural Architecture Search for Speaker Recognition
Shaojin Ding
Tianlong Chen
Xinyu Gong
Weiwei Zha
Zhangyang Wang
15
57
0
07 May 2020
Exploring Pre-training with Alignments for RNN Transducer based End-to-End Speech Recognition
Hu Hu
Rui Zhao
Jinyu Li
Liang Lu
Y. Gong
11
27
0
01 May 2020
Multiresolution and Multimodal Speech Recognition with Transformers
Georgios Paraskevopoulos
Srinivas Parthasarathy
Aparna Khare
Shiva Sundaram
18
29
0
29 Apr 2020
Transliteration of Judeo-Arabic Texts into Arabic Script Using Recurrent Neural Networks
Ori Terner
Kfir Bar
Nachum Dershowitz
14
3
0
23 Apr 2020
How to Teach DNNs to Pay Attention to the Visual Modality in Speech Recognition
George Sterpu
Christian Saam
N. Harte
34
28
0
17 Apr 2020
Minimum Latency Training Strategies for Streaming Sequence-to-Sequence ASR
H. Inaguma
Yashesh Gaur
Liang Lu
Jinyu Li
Y. Gong
AI4TS
25
46
0
10 Apr 2020
Hybrid Autoregressive Transducer (hat)
Ehsan Variani
David Rybach
Cyril Allauzen
Michael Riley
18
158
0
12 Mar 2020
A Density Ratio Approach to Language Model Fusion in End-To-End Automatic Speech Recognition
Erik McDermott
Hasim Sak
Ehsan Variani
17
112
0
26 Feb 2020
Distributed Training of Deep Neural Network Acoustic Models for Automatic Speech Recognition
Xiaodong Cui
Wei Zhang
Ulrich Finkler
G. Saon
M. Picheny
David S. Kung
22
19
0
24 Feb 2020
End-to-End Neural Diarization: Reformulating Speaker Diarization as Simple Multi-label Classification
Yusuke Fujita
Shinji Watanabe
Shota Horiguchi
Yawen Xue
Kenji Nagamatsu
12
49
0
24 Feb 2020
Imputer: Sequence Modelling via Imputation and Dynamic Programming
William Chan
Chitwan Saharia
Geoffrey E. Hinton
Mohammad Norouzi
Navdeep Jaitly
BDL
AI4TS
21
114
0
20 Feb 2020
Small energy masking for improved neural network training for end-to-end speech recognition
Chanwoo Kim
Kwangyoun Kim
S. Indurthi
14
8
0
15 Feb 2020
Accelerating RNN Transducer Inference via One-Step Constrained Beam Search
Juntae Kim
Yoonhan Lee
13
21
0
10 Feb 2020
Audio-Visual Decision Fusion for WFST-based and seq2seq Models
R. Aralikatti
Sharad Roy
Abhinav Thanda
D. Margam
Pujitha Appan Kandala
Tanay Sharma
S. Venkatesan
19
1
0
29 Jan 2020
Single headed attention based sequence-to-sequence model for state-of-the-art results on Switchboard
Zoltán Tüske
G. Saon
Kartik Audhkhasi
Brian Kingsbury
BDL
23
68
0
20 Jan 2020
Character-Aware Attention-Based End-to-End Speech Recognition
Zhong Meng
Yashesh Gaur
Jinyu Li
Y. Gong
15
10
0
06 Jan 2020
End-to-end training of time domain audio separation and recognition
Thilo von Neumann
K. Kinoshita
Lukas Drude
Christoph Boeddeker
Marc Delcroix
Tomohiro Nakatani
Reinhold Haeb-Umbach
17
34
0
18 Dec 2019
End-to-end ASR: from Supervised to Semi-Supervised Learning with Modern Architectures
Gabriel Synnaeve
Qiantong Xu
Jacob Kahn
Tatiana Likhomanenko
Edouard Grave
Vineel Pratap
Anuroop Sriram
Vitaliy Liptchinsky
R. Collobert
SSL
AI4TS
8
246
0
19 Nov 2019
Deep Spiking Neural Networks for Large Vocabulary Automatic Speech Recognition
Jibin Wu
Emre Yilmaz
Malu Zhang
Haizhou Li
Kay Chen Tan
25
104
0
19 Nov 2019
Transformer-based Cascaded Multimodal Speech Translation
Zixiu "Alex" Wu
Ozan Caglayan
Julia Ive
Josiah Wang
Lucia Specia
25
7
0
29 Oct 2019
Transformer-Transducer: End-to-End Speech Recognition with Self-Attention
Ching-Feng Yeh
Jay Mahadeokar
Kaustubh Kalgaonkar
Yongqiang Wang
Duc Le
Mahaveer Jain
Kjell Schubert
Christian Fuegen
M. Seltzer
18
147
0
28 Oct 2019
Towards Online End-to-end Transformer Automatic Speech Recognition
E. Tsunoo
Yosuke Kashiwagi
Toshiyuki Kumakura
Shinji Watanabe
22
32
0
25 Oct 2019
Recognizing long-form speech using streaming end-to-end models
A. Narayanan
Rohit Prabhavalkar
Chung-Cheng Chiu
David Rybach
Tara N. Sainath
Trevor Strohman
16
129
0
24 Oct 2019
Correction of Automatic Speech Recognition with Transformer Sequence-to-sequence Model
Oleksii Hrinchuk
Mariya Popova
Boris Ginsburg
VLM
12
87
0
23 Oct 2019
A Transformer with Interleaved Self-attention and Convolution for Hybrid Acoustic Models
Liang Lu
11
4
0
23 Oct 2019
G2G: TTS-Driven Pronunciation Learning for Graphemic Hybrid ASR
Duc Le
T. Koehler
Christian Fuegen
M. Seltzer
22
16
0
22 Oct 2019
Improving Transformer-based Speech Recognition Using Unsupervised Pre-training
Dongwei Jiang
Xiaoning Lei
Wubo Li
Ne Luo
Yuxuan Hu
Wei Zou
Xiangang Li
24
99
0
22 Oct 2019
Discriminative Neural Clustering for Speaker Diarisation
Qiujia Li
Florian Kreyssig
Chao Zhang
P. Woodland
11
44
0
22 Oct 2019
Transformer ASR with Contextual Block Processing
E. Tsunoo
Yosuke Kashiwagi
Toshiyuki Kumakura
Shinji Watanabe
51
64
0
16 Oct 2019
The Theory behind Controllable Expressive Speech Synthesis: a Cross-disciplinary Approach
Noé Tits
Kevin El Haddad
Thierry Dutoit
13
8
0
14 Oct 2019
One-To-Many Multilingual End-to-end Speech Translation
Mattia Antonino Di Gangi
Matteo Negri
Marco Turchi
25
50
0
08 Oct 2019
Multilingual End-to-End Speech Translation
H. Inaguma
Kevin Duh
Tatsuya Kawahara
Shinji Watanabe
LRM
17
86
0
01 Oct 2019
Improving RNN Transducer Modeling for End-to-End Speech Recognition
Jinyu Li
Rui Zhao
Hu Hu
Y. Gong
8
170
0
26 Sep 2019
Optimizing Speech Recognition For The Edge
Yuan Shangguan
Jian Li
Qiao Liang
R. Álvarez
Ian McGraw
20
64
0
26 Sep 2019
Automatic Lyrics Alignment and Transcription in Polyphonic Music: Does Background Music Help?
Chitralekha Gupta
Emre Yilmaz
Haizhou Li
13
14
0
23 Sep 2019
Acoustic scene analysis with multi-head attention networks
Weimin Wang
Weiran Wang
Ming Sun
Chao Wang
14
3
0
16 Sep 2019
An Investigation Into On-device Personalization of End-to-end Automatic Speech Recognition Models
K. Sim
P. Zadrazil
F. Beaufays
18
58
0
14 Sep 2019
Metric-Based Few-Shot Learning for Video Action Recognition
Chris Careaga
Brian Hutchinson
Nathan Oken Hodas
Lawrence Phillips
14
22
0
14 Sep 2019
NeMo: a toolkit for building AI applications using Neural Modules
Oleksii Kuchaiev
Jason Chun Lok Li
Huyen Nguyen
Oleksii Hrinchuk
Ryan Leary
...
Jack Cook
P. Castonguay
Mariya Popova
Jocelyn Huang
Jonathan M. Cohen
199
291
0
14 Sep 2019
A Comparative Study on Transformer vs RNN in Speech Applications
Shigeki Karita
Nanxin Chen
Tomoki Hayashi
Takaaki Hori
H. Inaguma
...
Ryuichi Yamamoto
Xiao-fei Wang
Shinji Watanabe
Takenori Yoshimura
Wangyou Zhang
23
716
0
13 Sep 2019
Neural Cognitive Diagnosis for Intelligent Education Systems
Fei-Yue Wang
Qi Liu
Enhong Chen
Zhenya Huang
Yuying Chen
Yu Yin
Zai Huang
Shijin Wang
AI4Ed
13
225
0
23 Aug 2019
Survey on Deep Neural Networks in Speech and Vision Systems
M. Alam
Manar D. Samad
Lasitha Vidyaratne
Alexander M. Glandon
Khan M. Iftekharuddin
3DV
VLM
AI4TS
31
205
0
16 Aug 2019
Investigating Target Set Reduction for End-to-End Speech Recognition of Hindi-English Code-Switching Data
Kunal Dhawan
Ganji Sreeram
Kumar Priyadarshi
R. Sinha
8
4
0
15 Jul 2019
Learn Spelling from Teachers: Transferring Knowledge from Language Models to Sequence-to-Sequence Speech Recognition
Ye Bai
Jiangyan Yi
J. Tao
Zhengkun Tian
Zhengqi Wen
KELM
11
38
0
13 Jul 2019
Analyzing Phonetic and Graphemic Representations in End-to-End Automatic Speech Recognition
Yonatan Belinkov
Ahmed M. Ali
James R. Glass
17
32
0
09 Jul 2019
Listen, Attend, Spell and Adapt: Speaker Adapted Sequence-to-Sequence ASR
F. Weninger
Jesús Andrés-Ferrer
Xinwei Li
P. Zhan
AI4TS
26
26
0
08 Jul 2019
Polyphone Disambiguation for Mandarin Chinese Using Conditional Neural Network with Multi-level Embedding Features
Zexin Cai
Yaogen Yang
Chuxiong Zhang
Xiaoyi Qin
Ming Li
16
26
0
03 Jul 2019
BERTphone: Phonetically-Aware Encoder Representations for Utterance-Level Speaker and Language Recognition
Shaoshi Ling
Julian Salazar
Yuzong Liu
Katrin Kirchhoff
SSL
22
27
0
30 Jun 2019
Non-Parallel Sequence-to-Sequence Voice Conversion with Disentangled Linguistic and Speaker Representations
Jing-Xuan Zhang
Zhenhua Ling
Lirong Dai
22
99
0
25 Jun 2019
Unsupervised Phoneme and Word Discovery from Multiple Speakers using Double Articulation Analyzer and Neural Network with Parametric Bias
Ryo Nakashima
Ryo Ozaki
T. Taniguchi
11
6
0
21 Jun 2019