1508.01211
Listen, Attend and Spell
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2015
5 August 2015
William Chan
Navdeep Jaitly
Quoc V. Le
Oriol Vinyals
RALM
Papers citing "Listen, Attend and Spell"
50 / 1,064 papers shown
SpeeChain: A Speech Toolkit for Large-Scale Machine Speech Chain
Heli Qi
Sashi Novitasari
Andros Tjandra
S. Sakti
Satoshi Nakamura
188
3
0
08 Jan 2023
Object Segmentation with Audio Context
Kaihui Zheng
Yuqing Ren
Zixin Shen
Tianxu Qin
VOS
189
0
0
04 Jan 2023
Sample-Efficient Unsupervised Domain Adaptation of Speech Recognition Systems: A case study for Modern Greek
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Georgios Paraskevopoulos
Theodoros Kouzelis
Georgios Rouvalis
Athanasios Katsamanis
Vassilis Katsouros
Alexandros Potamianos
VLM
280
13
0
31 Dec 2022
4D ASR: Joint modeling of CTC, Attention, Transducer, and Mask-Predict decoders
Interspeech (Interspeech), 2022
Yui Sudo
Muhammad Shakeel
Brian Yan
Jiatong Shi
Shinji Watanabe
116
13
0
21 Dec 2022
Attention as a Guide for Simultaneous Speech Translation
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Sara Papi
Matteo Negri
Marco Turchi
208
39
0
15 Dec 2022
GAMMA: Generative Augmentation for Attentive Marine Debris Detection
Vaishnavi Khindkar
Janhavi Khindkar
ViT
100
1
0
07 Dec 2022
Lattice-Free Sequence Discriminative Training for Phoneme-Based Neural Transducers
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Zijian Yang
Wei Zhou
Ralf Schluter
Hermann Ney
228
4
0
07 Dec 2022
Learning the joint distribution of two sequences using little or no paired data
Soroosh Mariooryad
Matt Shannon
Siyuan Ma
Tom Bagby
David Kao
Daisy Stanton
Eric Battenberg
RJ Skerry-Ryan
262
3
0
06 Dec 2022
LMEC: Learnable Multiplicative Absolute Position Embedding Based Conformer for Speech Recognition
Yuguang Yang
Yu Pan
Jingjing Yin
Heng Lu
251
4
0
05 Dec 2022
Continual Learning for On-Device Speech Recognition using Disentangled Conformers
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Anuj Diwan
Ching-Feng Yeh
Wei-Ning Hsu
Paden Tomasello
Eunsol Choi
David Harwath
Abdel-rahman Mohamed
CLL
BDL
284
9
0
02 Dec 2022
Neural Transducer Training: Reduced Memory Consumption with Sample-wise Computation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Stefan Braun
Erik McDermott
Roger Hsiao
147
1
0
29 Nov 2022
Unsupervised Model-based speaker adaptation of end-to-end lattice-free MMI model for speech recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Xurong Xie
Xunying Liu
Hui Chen
Hongan Wang
242
1
0
17 Nov 2022
Continuous Soft Pseudo-Labeling in ASR
Tatiana Likhomanenko
R. Collobert
Navdeep Jaitly
Samy Bengio
VLM
272
5
0
11 Nov 2022
Massively Multilingual ASR on 70 Languages: Tokenization, Architecture, and Generalization Capabilities
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Andros Tjandra
Nayan Singhal
David C. Zhang
Ozlem Kalinli
Abdel-rahman Mohamed
Duc Le
M. Seltzer
226
20
0
10 Nov 2022
Adaptive Multi-Corpora Language Model Training for Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Yingyi Ma
Zhe Liu
Xuedong Zhang
188
3
0
09 Nov 2022
Peak-First CTC: Reducing the Peak Latency of CTC Models by Applying Peak-First Regularization
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Zhengkun Tian
Hongyu Xiang
Min Li
Fei Lin
Ke Ding
Guanglu Wan
118
7
0
07 Nov 2022
Deliberation Networks and How to Train Them
Qingyun Dou
Mark Gales
115
0
0
06 Nov 2022
Multi-blank Transducers for Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Hainan Xu
Fei Jia
Somshubra Majumdar
Shinji Watanabe
Boris Ginsburg
215
12
0
04 Nov 2022
Once-for-All Sequence Compression for Self-Supervised Speech Models
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Hsuan-Jui Chen
Yen Meng
Hung-yi Lee
289
6
0
04 Nov 2022
The ISCSLP 2022 Intelligent Cockpit Speech Recognition Challenge (ICSRC): Dataset, Tracks, Baseline and Results
International Symposium on Chinese Spoken Language Processing (ISCSLP), 2022
Ao Zhang
F. Yu
Kaixun Huang
Linfu Xie
Longbiao Wang
Eng Siong Chng
Hui Bu
Binbin Zhang
Wei Chen
Xin Xu
201
5
0
03 Nov 2022
Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Speech Processing
Neural Information Processing Systems (NeurIPS), 2022
Yonggan Fu
Yang Zhang
Kaizhi Qian
Zhifan Ye
Zhongzhi Yu
Cheng-I Jeff Lai
Yingyan Lin
354
10
0
02 Nov 2022
Internal Language Model Estimation based Adaptive Language Model Fusion for Domain Adaptation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Rao Ma
Xiaobo Wu
Jin Qiu
Yanan Qin
Haihua Xu
Peihao Wu
Zejun Ma
170
3
0
02 Nov 2022
Fast-U2++: Fast and Accurate End-to-End Speech Recognition in Joint CTC/Attention Frames
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Che-Yuan Liang
Xiao-Lei Zhang
BinBin Zhang
Di Wu
Shengqiang Li
Xingcheng Song
Zhendong Peng
Fuping Pan
94
11
0
02 Nov 2022
Conversation-oriented ASR with multi-look-ahead CBS architecture
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Huaibo Zhao
S. Fujie
Tetsuji Ogawa
Jin Sakuma
Yusuke Kida
Tetsunori Kobayashi
238
3
0
02 Nov 2022
InterMPL: Momentum Pseudo-Labeling with Intermediate CTC Loss
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Yosuke Higuchi
Tetsuji Ogawa
Tetsunori Kobayashi
Shinji Watanabe
284
1
0
02 Nov 2022
TrimTail: Low-Latency Streaming ASR with Simple but Effective Spectrogram-Level Length Penalty
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Xingcheng Song
Di Wu
Zhiyong Wu
Binbin Zhang
Yuekai Zhang
Zhendong Peng
Wenpeng Li
Fuping Pan
Changbao Zhu
241
12
0
01 Nov 2022
Speech-text based multi-modal training with bidirectional attention for improved speech recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Yuhang Yang
Haihua Xu
Hao-Ming Huang
Eng Siong Chng
Sheng Li
178
9
0
01 Nov 2022
Joint Audio/Text Training for Transformer Rescorer of Streaming Speech Recognition
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Suyoun Kim
Ke Li
Lucas Kabela
Rongqing Huang
Jiedan Zhu
Ozlem Kalinli
Duc Le
195
8
0
31 Oct 2022
Structured State Space Decoder for Speech Recognition and Synthesis
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Koichi Miyazaki
Masato Murata
Tomoki Koriyama
258
14
0
31 Oct 2022
FusionFormer: Fusing Operations in Transformer for Efficient Streaming Speech Recognition
Xingcheng Song
Di Wu
Binbin Zhang
Zhiyong Wu
Wenpeng Li
...
Peng Zhang
Zhendong Peng
Fuping Pan
Changbao Zhu
Zhongqin Wu
133
2
0
31 Oct 2022
Modular Hybrid Autoregressive Transducer
Spoken Language Technology Workshop (SLT), 2022
Zhong Meng
Tongzhou Chen
Rohit Prabhavalkar
Yu Zhang
Gary Wang
...
Bhuvana Ramabhadran
Wenjie Huang
Ehsan Variani
Yinghui Huang
Pedro J. Moreno
188
27
0
31 Oct 2022
Blank Collapse: Compressing CTC emission for the faster decoding
Interspeech (Interspeech), 2022
Minkyu Jung
Ohhyeok Kwon
S. Seo
Soonshin Seo
237
3
0
31 Oct 2022
Partitioned Gradient Matching-based Data Subset Selection for Compute-Efficient Robust ASR Training
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Ashish R. Mittal
D. Sivasubramanian
Rishabh K. Iyer
Preethi Jyothi
Ganesh Ramakrishnan
162
4
0
30 Oct 2022
BERT Meets CTC: New Formulation of End-to-End Speech Recognition with Pre-trained Masked Language Model
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Yosuke Higuchi
Brian Yan
Siddhant Arora
Tetsuji Ogawa
Tetsunori Kobayashi
Shinji Watanabe
255
31
0
29 Oct 2022
Accelerating RNN-T Training and Inference Using CTC guidance
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Yongqiang Wang
Zhehuai Chen
Cheng-yong Zheng
Yu Zhang
Wei Han
Parisa Haghani
205
28
0
29 Oct 2022
Filter and evolve: progressive pseudo label refining for semi-supervised automatic speech recognition
Zezhong Jin
Dading Zhong
Xiao Song
Zhaoyi Liu
Naipeng Ye
Qingcheng Zeng
140
2
0
28 Oct 2022
Random Utterance Concatenation Based Data Augmentation for Improving Short-video Speech Recognition
Interspeech (Interspeech), 2022
Yist Y. Lin
Tao Han
Haihua Xu
Van Tung Pham
Yerbolat Khassanov
Tze Yuang Chong
Yi He
Lu Lu
Zejun Ma
151
3
0
28 Oct 2022
Token-level Sequence Labeling for Spoken Language Understanding using Compositional End-to-End Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Siddhant Arora
Siddharth Dalmia
Brian Yan
Florian Metze
A. Black
Shinji Watanabe
124
13
0
27 Oct 2022
Monotonic segmental attention for automatic speech recognition
Spoken Language Technology Workshop (SLT), 2022
Albert Zeyer
Robin Schmitt
Wei Zhou
Ralf Schluter
Hermann Ney
129
11
0
26 Oct 2022
Linguistic-Enhanced Transformer with CTC Embedding for Speech Recognition
International Conference on Mobile Ad-hoc and Sensor Networks (MSN), 2022
Xulong Zhang
Jianzong Wang
Ning Cheng
Mengyuan Zhao
Zhiyong Zhang
Jing Xiao
105
1
0
25 Oct 2022
ESB: A Benchmark For Multi-Domain End-to-End Speech Recognition
Sanchit Gandhi
Patrick von Platen
Alexander M. Rush
141
27
0
24 Oct 2022
Optimizing Bilingual Neural Transducer with Synthetic Code-switching Text Generation
Thien Nguyen
Nathalie Tran
Liuhui Deng
Thiago Fraga da Silva
Matthew Radzihovsky
...
Honza Silovsky
Arnab Ghoshal
M. Martel
Bharat Ram Ambati
Mohamed Ali
244
6
0
21 Oct 2022
Improving Semi-supervised End-to-end Automatic Speech Recognition using CycleGAN and Inter-domain Losses
Spoken Language Technology Workshop (SLT), 2022
C. Li
Ngoc Thang Vu
144
3
0
20 Oct 2022
Anchored Speech Recognition with Neural Transducers
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Desh Raj
Junteng Jia
Jay Mahadeokar
Chunyang Wu
Niko Moritz
Xiaohui Zhang
Ozlem Kalinli
237
2
0
20 Oct 2022
End-to-End Integration of Speech Recognition, Dereverberation, Beamforming, and Self-Supervised Learning Representation
Spoken Language Technology Workshop (SLT), 2022
Yoshiki Masuyama
Xuankai Chang
Samuele Cornell
Shinji Watanabe
Nobutaka Ono
243
25
0
19 Oct 2022
Helpful Neighbors: Leveraging Neighbors in Geographic Feature Pronunciation
Transactions of the Association for Computational Linguistics (TACL), 2022
Llion Jones
R. Sproat
Haruko Ishikawa
Alexander Gutkin
194
2
0
18 Oct 2022
Acoustic-aware Non-autoregressive Spell Correction with Mask Sample Decoding
Ruchao Fan
Guoli Ye
Yashesh Gaur
Jinyu Li
183
4
0
16 Oct 2022
A Policy-based Approach to the SpecAugment Method for Low Resource E2E ASR
Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2022
Rui Li
Guodong Ma
Dexin Zhao
Ranran Zeng
Xiaoyu Li
Haolin Huang
131
5
0
16 Oct 2022
On Compressing Sequences for Self-Supervised Speech Models
Spoken Language Technology Workshop (SLT), 2022
Yen Meng
Hsuan-Jui Chen
Jiatong Shi
Shinji Watanabe
Paola García
Hung-yi Lee
Hao Tang
SSL
195
15
0
13 Oct 2022
An Experimental Study on Private Aggregation of Teacher Ensemble Learning for End-to-End Speech Recognition
Spoken Language Technology Workshop (SLT), 2022
Chao-Han Huck Yang
I-Fan Chen
A. Stolcke
Sabato Marco Siniscalchi
Chin-Hui Lee
173
3
0
11 Oct 2022