1508.01211
Cited By
v1
v2 (latest)
Listen, Attend and Spell
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2015
5 August 2015
William Chan
Navdeep Jaitly
Quoc V. Le
Oriol Vinyals
RALM
Papers citing
"Listen, Attend and Spell"
50 / 1,064 papers shown
Empowering the Deaf and Hard of Hearing Community: Enhancing Video Captions Using Large Language Models
Nadeen Fathallah
Monika Bhole
Steffen Staab
366
2
0
30 Nov 2024
Towards Maximum Likelihood Training for Transducer-based Streaming Speech Recognition
IEEE Signal Processing Letters (SPL), 2024
Hyeonseung Lee
J. Yoon
Sungsoo Kim
N. Kim
300
0
0
26 Nov 2024
On the Cost of Model-Serving Frameworks: An Experimental Evaluation
Pasquale De Rosa
Yérom-David Bromberg
Pascal Felber
Djob Mvondo
V. Schiavoni
207
1
0
15 Nov 2024
emg2qwerty: A Large Dataset with Baselines for Touch Typing using Surface Electromyography
Neural Information Processing Systems (NeurIPS), 2024
Viswanath Sivakumar
Jeffrey Seely
Alan Du
Sean R Bittner
Adam Berenzweig
Anuoluwapo Bolarinwa
Alexandre Gramfort
Michael I Mandel
372
24
0
26 Oct 2024
A two-stage transliteration approach to improve performance of a multilingual ASR
Rohit Kumar
162
0
0
09 Oct 2024
The USTC-NERCSLIP Systems for the CHiME-8 MMCSG Challenge
Ya Jiang
Hongbo Lan
Jun Du
Qing Wang
Shutong Niu
309
1
0
08 Oct 2024
Multi-Dialect Vietnamese: Task, Dataset, Baseline Models and Challenges
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Nguyen Van Dinh
Thanh Chi Dang
Luan Thanh Nguyen
Kiet Van Nguyen
221
5
0
04 Oct 2024
The Conformer Encoder May Reverse the Time Dimension
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Robin Schmitt
Albert Zeyer
Mohammad Zeineldeen
Ralf Schluter
Hermann Ney
293
1
0
01 Oct 2024
Predictive Speech Recognition and End-of-Utterance Detection Towards Spoken Dialog Systems
Oswald Zink
Yosuke Higuchi
Carlos Mullov
Alexander Waibel
Tetsunori Kobayashi
133
3
0
30 Sep 2024
Speech-Mamba: Long-Context Speech Recognition with Selective State Spaces Models
Spoken Language Technology Workshop (SLT), 2024
Xiaoxue Gao
Nancy F. Chen
Mamba
206
12
0
27 Sep 2024
Improving Multilingual ASR in the Wild Using Simple N-best Re-ranking
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Brian Yan
Vineel Pratap
Shinji Watanabe
Michael Auli
255
1
0
27 Sep 2024
Exploring Information-Theoretic Metrics Associated with Neural Collapse in Supervised Training
Kun Song
Zhiquan Tan
Bochao Zou
Jiansheng Chen
Huimin Ma
Weiran Huang
384
2
0
25 Sep 2024
Target word activity detector: An approach to obtain ASR word boundaries without lexicon
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
S. Sivasankaran
Eric Sun
Jinyu Li
Yan-ping Huang
Jing Pan
146
0
0
20 Sep 2024
EMMeTT: Efficient Multimodal Machine Translation Training
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Piotr Żelasko
Zhehuai Chen
Mengru Wang
Daniel Galvez
Oleksii Hrinchuk
Shuoyang Ding
Ke Hu
Jagadeesh Balam
Vitaly Lavrukhin
Boris Ginsburg
176
4
0
20 Sep 2024
AutoMode-ASR: Learning to Select ASR Systems for Better Quality and Cost
International Conference on Speech and Computer (SPECOM), 2024
Ahmet Gündüz
Yunsu Kim
Kamer Ali Yuksel
Mohamed Al-Badrashiny
Thiago Castro Ferreira
Hassan Sawaf
191
0
0
19 Sep 2024
Disentangling Speakers in Multi-Talker Speech Recognition with Speaker-Aware CTC
Jiawen Kang
Lingwei Meng
Mingyu Cui
Yuejiao Wang
Xixin Wu
Xunying Liu
Helen Meng
274
6
0
19 Sep 2024
A Joint Spectro-Temporal Relational Thinking Based Acoustic Modeling Framework
Zheng Nan
T. Dang
V. Sethu
Beena Ahmed
150
0
0
17 Sep 2024
ASR Error Correction using Large Language Models
IEEE Transactions on Audio, Speech, and Language Processing (TASLP), 2024
Rao Ma
Mengjie Qian
Mark Gales
Kate Knill
KELM
313
21
0
14 Sep 2024
Findings of the 2024 Mandarin Stuttering Event Detection and Automatic Speech Recognition Challenge
Spoken Language Technology Workshop (SLT), 2024
Hongfei Xue
Rong Gong
Mingchen Shao
Xin Xu
L. Wang
...
Yong Qin
Jun Du
Ming Li
Binbin Zhang
Bin Jia
186
5
0
09 Sep 2024
Lightweight Transducer Based on Frame-Level Criterion
Interspeech (Interspeech), 2024
Genshun Wan
Mengzhi Wang
Tingzhi Mao
Hang Chen
Z. Ye
240
1
0
05 Sep 2024
Enhancing Code-Switching Speech Recognition with LID-Based Collaborative Mixture of Experts Model
Spoken Language Technology Workshop (SLT), 2024
Hukai Huang
Jiayan Lin
Kaidi Wang
Yishuang Li
Wenhao Guan
Lin Li
Q. Hong
MoE
231
3
0
03 Sep 2024
Comparative Study on Noise-Augmented Training and its Effect on Adversarial Robustness in ASR Systems
Computer Speech and Language (CSL), 2024
Karla Pizzi
Matías P. Pizarro
Asja Fischer
332
1
0
03 Sep 2024
What does it take to get state of the art in simultaneous speech-to-speech translation?
Vincent Wilmet
Johnson Du
171
0
0
02 Sep 2024
Serialized Speech Information Guidance with Overlapped Encoding Separation for Multi-Speaker Automatic Speech Recognition
Spoken Language Technology Workshop (SLT), 2024
Hao Shi
Yuan Gao
Zhaoheng Ni
Tatsuya Kawahara
433
5
0
01 Sep 2024
The State of Commercial Automatic French Legal Speech Recognition Systems and their Impact on Court Reporters et al
Nicolas Garneau
Olivier Bolduc
ELM
AILaw
165
1
0
21 Aug 2024
Survey: Transformer-based Models in Data Modality Conversion
Elyas Rashno
Amir Eskandari
Aman Anand
F. Zulkernine
MedIm
225
6
0
08 Aug 2024
On the Problem of Text-To-Speech Model Selection for Synthetic Data Generation in Automatic Speech Recognition
Nick Rossenbach
Ralf Schluter
S. Sakti
179
4
0
31 Jul 2024
On the Effect of Purely Synthetic Training Data for Different Automatic Speech Recognition Architectures
Nick Rossenbach
Benedikt Hilmes
Ralf Schluter
223
9
0
25 Jul 2024
CUSIDE-T: Chunking, Simulating Future and Decoding for Transducer based Streaming ASR
Wenbo Zhao
Ziwei Li
Chuan Yu
Zhijian Ou
AI4TS
257
3
0
14 Jul 2024
Seed-ASR: Understanding Diverse Speech and Contexts with LLM-based Speech Recognition
Ye Bai
Jingping Chen
Jitong Chen
Wei Chen
Zhuo Chen
...
Wanyi Zhang
Yang Zhang
Yawei Zhang
Yijie Zheng
Ming Zou
AuLLM
364
69
0
05 Jul 2024
Serialized Output Training by Learned Dominance
Ying Shi
Lantian Li
Shi Yin
D. Wang
Jiqing Han
142
7
0
04 Jul 2024
BESTOW: Efficient and Streamable Speech Language Model with the Best of Two Worlds in GPT and T5
Zhehuai Chen
He Huang
Oleksii Hrinchuk
Krishna Puvvada
Nithin Rao Koluguri
Piotr Żelasko
Jagadeesh Balam
Boris Ginsburg
AuLLM
RALM
249
18
0
28 Jun 2024
MSR-86K: An Evolving, Multilingual Corpus with 86,300 Hours of Transcribed Audio for Speech Recognition Research
Song Li
Yongbin You
Xuezhi Wang
Zhengkun Tian
Ke Ding
Guanglu Wan
200
11
0
26 Jun 2024
Token-Weighted RNN-T for Learning from Flawed Data
Gil Keren
Wei Zhou
Ozlem Kalinli
264
1
0
26 Jun 2024
Automatic speech recognition for the Nepali language using CNN, bidirectional LSTM and ResNet
Manish Dhakal
Arman Chhetri
Aman Kumar Gupta
Prabin B. Lamichhane
S. Pandey
S. Shakya
AI4TS
190
11
0
25 Jun 2024
InterBiasing: Boost Unseen Word Recognition through Biasing Intermediate Predictions
Yu Nakagome
Michael Hentschel
206
4
0
21 Jun 2024
Instruction Data Generation and Unsupervised Adaptation for Speech Language Models
Vahid Noroozi
Zhehuai Chen
Somshubra Majumdar
Steve Huang
Jagadeesh Balam
Boris Ginsburg
SyDa
341
5
0
18 Jun 2024
Guiding Frame-Level CTC Alignments Using Self-knowledge Distillation
Eungbeom Kim
Hantae Kim
Kyogu Lee
185
2
0
12 Jun 2024
Dual-Pipeline with Low-Rank Adaptation for New Language Integration in Multilingual ASR
Yerbolat Khassanov
Zhipeng Chen
Tianfeng Chen
Tze Yuang Chong
Wei Li
Jun Zhang
Lu Lu
Yuxuan Wang
AI4CE
173
2
0
12 Jun 2024
StreamAtt: Direct Streaming Speech-to-Text Translation with Attention-based Audio History Selection
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Sara Papi
Marco Gaido
Matteo Negri
L. Bentivogli
386
16
0
10 Jun 2024
LoRA-Whisper: Parameter-Efficient and Extensible Multilingual ASR
Interspeech (Interspeech), 2024
Zheshu Song
Jianheng Zhuo
Yifan Yang
Ziyang Ma
Shixiong Zhang
Xie Chen
208
32
0
07 Jun 2024
Unveiling the Dynamics of Information Interplay in Supervised Learning
Kun Song
Zhiquan Tan
Bochao Zou
Huimin Ma
Weiran Huang
227
3
0
06 Jun 2024
Joint Beam Search Integrating CTC, Attention, and Transducer Decoders
Yui Sudo
Muhammad Shakeel
Yosuke Fukumoto
Brian Yan
Jiatong Shi
Yifan Peng
Shinji Watanabe
243
2
0
05 Jun 2024
Joint Optimization of Streaming and Non-Streaming Automatic Speech Recognition with Multi-Decoder and Knowledge Distillation
Muhammad Shakeel
Yui Sudo
Yifan Peng
Shinji Watanabe
232
0
0
22 May 2024
Contextualized Automatic Speech Recognition with Dynamic Vocabulary
Yui Sudo
Yosuke Fukumoto
Muhammad Shakeel
Yifan Peng
Shinji Watanabe
290
8
0
22 May 2024
Gated Low-rank Adaptation for personalized Code-Switching Automatic Speech Recognition on the low-spec devices
Gwantae Kim
Bokyeung Lee
Donghyeon Kim
Hanseok Ko
OffRL
180
2
0
24 Apr 2024
Transducers with Pronunciation-aware Embeddings for Automatic Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Hainan Xu
Zhehuai Chen
Fei Jia
Boris Ginsburg
167
0
0
04 Apr 2024
Effective internal language model training and fusion for factorized transducer model
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Jinxi Guo
Niko Moritz
Yingyi Ma
Frank Seide
Chunyang Wu
Jay Mahadeokar
Ozlem Kalinli
Christian Fuegen
Michael Seltzer
195
4
0
02 Apr 2024
Enhancing Efficiency in Vision Transformer Networks: Design Techniques and Insights
Moein Heidari
Reza Azad
Sina Ghorbani Kolahi
René Arimond
Leon Niggemeier
...
Afshin Bozorgpour
Ehsan Khodapanah Aghdam
Amirhossein Kazerouni
Ilker Hacihaliloglu
Dorit Merhof
303
14
0
28 Mar 2024
M^3AV: A Multimodal, Multigenre, and Multipurpose Audio-Visual Academic Lecture Dataset
Zhe Chen
Heyang Liu
Wenyi Yu
Guangzhi Sun
Hongcheng Liu
Ji Wu
Chao Zhang
Yu Wang
Yanfeng Wang
VGen
175
3
0
21 Mar 2024