Listen, Attend and Spell

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2015

5 August 2015

Papers citing "Listen, Attend and Spell"

50 / 1,064 papers shown

Advanced Long-Content Speech Recognition With Factorized Neural Transducer

Xie Chen

228

20 Mar 2024

Skipformer: A Skip-and-Recover Strategy for Efficient Speech Recognition

IEEE International Conference on Multimedia and Expo (ICME), 2024

246

13 Mar 2024

The evaluation of a code-switched Sepedi-English automatic speech recognition system

Amanda Phaladi

T. Modipa

163

11 Mar 2024

A&B BNN: Add&Bit-Operation-Only Hardware-Friendly Binary Neural Network

251

06 Mar 2024

Towards Accurate Lip-to-Speech Synthesis in-the-Wild

Sindhu B. Hegde

Rudrabha Mukhopadhyay

C. V. Jawahar

Vinay P. Namboodiri

192

02 Mar 2024

Post-decoder Biasing for End-to-End Speech Recognition of Multi-turn Medical Interview

Heyang Liu

Yu Wang

Yanfeng Wang

278

01 Mar 2024

Extreme Encoder Output Frame Rate Reduction: Improving Computational Latencies of Large End-to-End Models

201

27 Feb 2024

Alternating Weak Triphone/BPE Alignment Supervision from Hybrid Model Improves End-to-End ASR

287

23 Feb 2024

How do Hyenas deal with Human Speech? Speech Recognition and Translation with ConfHyena

231

20 Feb 2024

Comparison of Conventional Hybrid and CTC/Attention Decoders for Continuous Visual Speech Recognition

David Gimeno-Gómez

Carlos David Martínez Hinarejos

201

20 Feb 2024

An Embarrassingly Simple Approach for LLM with Strong ASR Capacity

Yifan Yang

...

Qian Chen

Siqi Zheng

Shiliang Zhang

Xie Chen

AuLLM

171

102

13 Feb 2024

Self-consistent context aware conformer transducer for speech recognition

Konstantin Kolokolov

Pavel Pekichev

Karthik Raghunathan

171

09 Feb 2024

Shortcuts Everywhere and Nowhere: Exploring Multi-Trigger Backdoor Attacks

IEEE Transactions on Dependable and Secure Computing (IEEE TDSC), 2024

308

27 Jan 2024

Contextualized Automatic Speech Recognition with Attention-Based Bias Phrase Boosted Beam Search

Shinji Watanabe

221

19 Jan 2024

Improving ASR Contextual Biasing with Guided Attention

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

Kwangyoun Kim

Shinji Watanabe

186

16 Jan 2024

LCB-net: Long-Context Biasing for Audio-Visual Speech Recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

213

12 Jan 2024

Cross-Speaker Encoding Network for Multi-Talker Speech Recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

Lingwei Meng

163

08 Jan 2024

A unified multichannel far-field speech recognition system: combining neural beamforming with attention based end-to-end model

104

05 Jan 2024

CTC Blank Triggered Dynamic Layer-Skipping for Efficient CTC-based Speech Recognition

Automatic Speech Recognition & Understanding (ASRU), 2023

160

04 Jan 2024

BLSTM-Based Confidence Estimation for End-to-End Speech Recognition

330

22 Dec 2023

Speaker Mask Transformer for Multi-talker Overlapped Speech Recognition

Peng Shen

Xugang Lu

Hisashi Kawai

181

18 Dec 2023

Conformer-Based Speech Recognition On Extreme Edge-Computing Devices

North American Chapter of the Association for Computational Linguistics (NAACL), 2023

...

Mahesh Krishnamoorthy

238

16 Dec 2023

USM-Lite: Quantization and Sparsity Aware Fine-tuning for Speech Recognition with Universal Speech Models

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

...

472

13 Dec 2023

D4AM: A General Denoising Framework for Downstream Acoustic Models

International Conference on Learning Representations (ICLR), 2023

H. Wang

Yu Tsao

Hsin-Min Wang

Chu-Song Chen

175

28 Nov 2023

Weak Alignment Supervision from Hybrid Model Improves End-to-end ASR

Jintao Jiang

Yingbo Gao

Zoltán Tüske

335

24 Nov 2023

Analysis of Visual Features for Continuous Lipreading in Spanish

IberSPEECH Conference (IberSPEECH), 2021

David Gimeno-Gómez

Carlos David Martínez Hinarejos

218

21 Nov 2023

LIP-RTVE: An Audiovisual Database for Continuous Spanish in the Wild

International Conference on Language Resources and Evaluation (LREC), 2023

David Gimeno-Gómez

Carlos David Martínez Hinarejos

232

21 Nov 2023

Phonological Level wav2vec2-based Mispronunciation Detection and Diagnosis Method

M. Shahin

Julien Epps

Beena Ahmed

122

13 Nov 2023

End-to-End Single-Channel Speaker-Turn Aware Conversational Speech Translation

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

235

01 Nov 2023

Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling

340

104

01 Nov 2023

MixRep: Hidden Representation Mixup for Low-Resource Speech Recognition

Interspeech (Interspeech), 2023

Jiamin Xie

John H. L. Hansen

144

27 Oct 2023

Key Frame Mechanism For Efficient Conformer Based End-to-end Speech Recognition

IEEE Signal Processing Letters (IEEE SPL), 2023

239

23 Oct 2023

Tailoring Adversarial Attacks on Deep Neural Networks for Targeted Class Manipulation Using DeepFool Algorithm

Scientific Reports (Sci Rep), 2023

S. M. Fazle

J. Mondal

Meem Arafat Manab

Xi Xiao

Sarfaraz Newaz

AAML

455

18 Oct 2023

End-to-End real time tracking of children's reading with pointer network

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Vishal Sunder

Beulah Karrolla

Eric Fosler-Lussier

17 Oct 2023

Correction Focused Language Model Training for Speech Recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Yingyi Ma

Zhe Liu

Ozlem Kalinli

KELM

283

17 Oct 2023

Personalization of CTC-based End-to-End Speech Recognition Using Pronunciation-Driven Subword Tokenization

...

211

16 Oct 2023

Improved Contextual Recognition In Automatic Speech Recognition Systems By Semantic Lattice Rescoring

359

14 Oct 2023

On the Relevance of Phoneme Duration Variability of Synthesized Training Data for Automatic Speech Recognition

Automatic Speech Recognition & Understanding (ASRU), 2023

Nick Rossenbach

Benedikt Hilmes

Ralf Schlüter

148

12 Oct 2023

Investigating the Effect of Language Models in Sequence Discriminative Training for Neural Transducers

Automatic Speech Recognition & Understanding (ASRU), 2023

158

11 Oct 2023

Whispering LLaMA: A Cross-Modal Generative Error Correction Framework for Speech Recognition

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Chao-Han Huck Yang

357

10 Oct 2023

ED-CEC: Improving Rare Word Recognition Using ASR Postprocessing Based on Error Detection and Context-Aware Error Correction

Automatic Speech Recognition & Understanding (ASRU), 2023

Jiajun He

Zekun Yang

Tomoki Toda

190

08 Oct 2023

Spike-Triggered Contextual Biasing for End-to-End Mandarin Speech Recognition

Automatic Speech Recognition & Understanding (ASRU), 2023

Binbin Zhang

Lei Xie

167

07 Oct 2023

Dementia Assessment Using Mandarin Speech with an Attention-based Speech Recognition Encoder

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

116

06 Oct 2023

Contextual Biasing with the Knuth-Morris-Pratt Matching Algorithm

Interspeech (Interspeech), 2023

Weiran Wang

Zelin Wu

D. Caseiro

Tsendsuren Munkhdalai

...

Ding Zhao

242

29 Sep 2023

LAE-ST-MoE: Boosted Language-Aware Encoder Using Speech Translation Auxiliary Task for E2E Code-switching ASR

Automatic Speech Recognition & Understanding (ASRU), 2023

222

28 Sep 2023

Cross-Modal Multi-Tasking for Speech-to-Text Translation via Hard Parameter Sharing

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

B. Grimstad

Xuankai Chang

Antonios Anastasopoulos

Yuya Fujita

Shinji Watanabe

288

27 Sep 2023

HyPoradise: An Open Baseline for Generative Speech Recognition with Large Language Models

Neural Information Processing Systems (NeurIPS), 2023

Cheng Chen

Yuchen Hu

Chao-Han Huck Yang

Sabato Marco Siniscalchi

Pin-Yu Chen

Eng Siong Chng

219

27 Sep 2023

Developing automatic verbatim transcripts for international multilingual meetings: an end-to-end solution

Machine Translation Summit (MT Summit), 2023

133

27 Sep 2023

Segment-Level Vectorized Beam Search Based on Partially Autoregressive Inference

Automatic Speech Recognition & Understanding (ASRU), 2023

286

26 Sep 2023

On the Relation between Internal Language Model and Sequence Discriminative Training for Neural Transducers

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

260

25 Sep 2023