Listen, Attend and Spell

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2015

5 August 2015

Papers citing "Listen, Attend and Spell"

50 / 1,064 papers shown

Fewer Hallucinations, More Verification: A Three-Stage LLM-Based Framework for ASR Error Correction

320

24 Dec 2025

WST: Weakly Supervised Transducer for Automatic Speech Recognition

160

06 Nov 2025

A Neural Model for Contextual Biasing Score Learning and Filtering

Wanting Huang

Weiran Wang

106

27 Oct 2025

StutterZero and StutterFormer: End-to-End Speech Conversion for Stuttering Transcription and Correction

Qianheng Xu

142

21 Oct 2025

Proprioceptive Image: An Image Representation of Proprioceptive Data from Quadruped Robots for Contact Estimation Learning

121

16 Oct 2025

End-to-end Speech Recognition with similar length speech and text

Peng Fan

Wenping Wang

Fei Deng

100

12 Oct 2025

Automatic Speech Recognition in the Modern Era: Architectures, Training, and Evaluation

125

11 Oct 2025

Adapting Diarization-Conditioned Whisper for End-to-End Multi-Talker Speech Recognition

Jan "Honza" Cernocký

117

04 Oct 2025

Building Tailored Speech Recognizers for Japanese Speaking Assessment

25 Sep 2025

WolBanking77: Wolof Banking Speech Intent Classification Dataset

215

23 Sep 2025

UMA-Split: unimodal aggregation for both English and Mandarin non-autoregressive speech recognition

Ying Fang

Xiaofei Li

115

18 Sep 2025

Whisper Has an Internal Word Aligner

Sung-Lin Yeh

Yen Meng

Hao Tang

120

12 Sep 2025

Streaming Sequence-to-Sequence Learning with Delayed Streams Modeling

227

10 Sep 2025

Enhancing the Robustness of Contextual ASR to Varying Biasing Information Volumes Through Purified Semantic Correlation Joint Modeling

IEEE Transactions on Audio, Speech, and Language Processing (TASLP), 2025

110

07 Sep 2025

Serialized Output Prompting for Large Language Model-based Multi-Talker Speech Recognition

102

01 Sep 2025

H-PRM: A Pluggable Hotword Pre-Retrieval Module for Various Speech Recognition Systems

121

22 Aug 2025

A Comparative Analysis on ASR System Combination for Attention, CTC, Factored Hybrid, and Transducer Models

13 Aug 2025

TurboBias: Universal ASR Context-Biasing powered by GPU-accelerated Phrase-Boosting Tree

177

09 Aug 2025

Efficient Scaling for LLM-based ASR

193

06 Aug 2025

Triple X: A LLM-Based Multilingual Speech Recognition System for the INTERSPEECH2025 MLC-SLM Challenge

162

23 Jul 2025

Supporting SENCOTEN Language Documentation Efforts with Automatic Speech Recognition

143

14 Jul 2025

Mixture of LoRA Experts with Multi-Modal and Multi-Granularity LLM Generative Error Correction for Accented Speech Recognition

IEEE Transactions on Audio, Speech, and Language Processing (TASLP), 2025

297

12 Jul 2025

Speaker-Distinguishable CTC: Learning Speaker Distinction Using CTC for Multi-Talker Speech Recognition

119

09 Jun 2025

WCTC-Biasing: Retraining-free Contextual Biasing ASR with Wildcard CTC-based Keyword Spotting and Inter-layer Biasing

Yu Nakagome

Michael Hentschel

232

02 Jun 2025

PMF-CEC: Phoneme-augmented Multimodal Fusion for Context-aware ASR Error Correction with Error-specific Selective Decoding

IEEE Transactions on Audio, Speech, and Language Processing (TASLP), 2025

Jiajun He

Tomoki Toda

148

31 May 2025

Contextualized Automatic Speech Recognition with Dynamic Vocabulary Prediction and Activation

214

29 May 2025

Cross-modal Knowledge Transfer Learning as Graph Matching Based on Optimal Transport for ASR

317

19 May 2025

RNN-Transducer-based Losses for Speech Recognition on Noisy Targets

Vladimir Bataev

382

09 Apr 2025

A 71.2-μW Speech Recognition Accelerator with Recurrent Spiking Neural Network

IEEE Transactions on Circuits and Systems Part 1: Regular Papers (TCAS-I), 2024

Chih-Chyau Yang

Tian-Sheuan Chang

358

27 Mar 2025

Dolphin: A Large-Scale Automatic Speech Recognition Model for Eastern Languages

306

26 Mar 2025

Improving Speech Recognition Accuracy Using Custom Language Models with the Vosk Toolkit

Aniket Abhishek Soni

145

26 Mar 2025

Evaluating ASR Confidence Scores for Automated Error Detection in User-Assisted Correction Interfaces

Korbinian Kuhn

Verena Kersken

Gottfried Zimmermann

274

19 Mar 2025

Automatic Speech Recognition for Non-Native English: Accuracy and Disfluency Handling

Michael McGuire

212

10 Mar 2025

Training and Inference Efficiency of Encoder-Decoder Speech Models

325

07 Mar 2025

Self-Supervised Models for Phoneme Recognition: Applications in Children's Speech for Reading Learning

Interspeech (Interspeech), 2024

233

06 Mar 2025

Improving Streaming Speech Recognition With Time-Shifted Contextual Attention And Dynamic Right Context Masking

Interspeech (Interspeech), 2024

Khanh Le

Duc Thanh Chau

287

24 Feb 2025

Retrieval-Augmented Speech Recognition Approach for Domain Challenges

262

24 Feb 2025

Low-Rank and Sparse Model Merging for Multi-Lingual Speech Recognition and Translation

1.0K

24 Feb 2025

Note-Level Singing Melody Transcription for Time-Aligned Musical Score Generation

IEEE Transactions on Audio, Speech, and Language Processing (TASLP), 2025

308

18 Feb 2025

A Differentiable Alignment Framework for Sequence-to-Sequence Modeling via Optimal Transport

420

03 Feb 2025

HadamRNN: Binary and Sparse Ternary Orthogonal RNNs

International Conference on Learning Representations (ICLR), 2025

Armand Foucault

Franck Mamalet

François Malgouyres

914

28 Jan 2025

Variational Bayesian Adaptive Learning of Deep Latent Variables for Acoustic Knowledge Transfer

IEEE Transactions on Audio, Speech, and Language Processing (TASLP), 2025

Hu Hu

Sabato Marco Siniscalchi

Chao-Han Huck Yang

Chin-Hui Lee

270

28 Jan 2025

FireRedASR: Open-Source Industrial-Grade Mandarin Speech Recognition Models from Encoder-Decoder to LLM Integration

411

24 Jan 2025

Harnessing the Zero-Shot Power of Instruction-Tuned Large Language Model in End-to-End Speech Recognition

243

08 Jan 2025

Breaking Through the Spike: Spike Window Decoding for Accelerated and Precise Automatic Speech Recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025

124

08 Jan 2025

Prepending or Cross-Attention for Speech-to-Text? An Empirical Comparison

North American Chapter of the Association for Computational Linguistics (NAACL), 2025

433

04 Jan 2025

Automatic Text Pronunciation Correlation Generation and Application for Contextual Biasing

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025

03 Jan 2025

Speech-Based Depression Prediction Using Encoder-Weight-Only Transfer Learning and a Large Corpus

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021

342

22 Dec 2024

LAMA-UT: Language Agnostic Multilingual ASR through Orthography Unification and Language-Specific Transliteration

AAAI Conference on Artificial Intelligence (AAAI), 2024

Sangmin Lee

Woo-Jin Chung

Hong-Goo Kang

471

19 Dec 2024

Complexity boosted adaptive training for better low resource ASR performance

Hongxuan Lu

Shenjian Wang

Biao Li

273

01 Dec 2024