Listen, Attend and Spell

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2015

5 August 2015

Papers citing "Listen, Attend and Spell"

50 / 1,064 papers shown

Hystoc: Obtaining word confidences for fusion of end-to-end ASR systems

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Karel Beneš

M. Kocour

L. Burget

129

21 May 2023

VAKTA-SETU: A Speech-to-Speech Machine Translation Service in Select Indic Languages

152

21 May 2023

Multi-Head State Space Model for Speech Recognition

Interspeech (Interspeech), 2023

...

Ozlem Kalinli

160

21 May 2023

Contextualized End-to-End Speech Recognition with Contextual Phrase Prediction Network

Interspeech (Interspeech), 2023

446

21 May 2023

Language-universal phonetic encoder for low-resource speech recognition

Interspeech (Interspeech), 2023

Yuxuan Wang

211

19 May 2023

Blank-regularized CTC for Frame Skipping in Neural Transducer

Interspeech (Interspeech), 2023

Yifan Yang

Xiaoyu Yang

Liyong Guo

Zengwei Yao

Wei Kang

Fangjun Kuang

Long Lin

Xie Chen

Daniel Povey

142

19 May 2023

AlignAtt: Using Attention-based Audio-Translation Alignments as a Guide for Simultaneous Speech Translation

Interspeech (Interspeech), 2023

Sara Papi

Marco Turchi

Matteo Negri

185

19 May 2023

A Comparative Study on E-Branchformer vs Conformer in Speech Recognition, Translation, and Understanding Tasks

Interspeech (Interspeech), 2023

Kwangyoun Kim

236

18 May 2023

FunASR: A Fundamental End-to-End Speech Recognition Toolkit

Interspeech (Interspeech), 2023

...

246

110

18 May 2023

A Lexical-aware Non-autoregressive Transformer-based ASR Model

Interspeech (Interspeech), 2023

Chong Lin

Kuan-Yu Chen

AI4TS

122

18 May 2023

Accurate and Reliable Confidence Estimation Based on Non-Autoregressive End-to-End Speech Recognition System

Interspeech (Interspeech), 2023

151

18 May 2023

Masked Audio Text Encoders are Effective Multi-Modal Rescorers

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

345

11 May 2023

Robust Acoustic and Semantic Contextual Biasing in Neural Transducers for Speech Recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Xuandi Fu

Kanthashree Mysore Sathyendra

Athanasios Mouchtaris

292

09 May 2023

End-to-end spoken language understanding using joint CTC loss and self-supervised, pretrained acoustic encoders

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

169

04 May 2023

Deep Transfer Learning for Automatic Speech Recognition: Towards Better Generalization

Knowledge-Based Systems (KBS), 2023

Hamza Kheddar

298

117

27 Apr 2023

Self-regularised Minimum Latency Training for Streaming Transformer-based Speech Recognition

Interspeech (Interspeech), 2022

Mohan Li

R. Doddipatla

Catalin Zorila

228

24 Apr 2023

Non-autoregressive End-to-end Approaches for Joint Automatic Speech Recognition and Spoken Language Understanding

Spoken Language Technology Workshop (SLT), 2023

Mohan Li

R. Doddipatla

169

21 Apr 2023

DropDim: A Regularization Method for Transformer Networks

IEEE Signal Processing Letters (IEEE SPL), 2023

190

20 Apr 2023

A Comparison of Semi-Supervised Learning Techniques for Streaming ASR at Scale

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

127

19 Apr 2023

Dynamic Chunk Convolution for Unified Streaming and Non-Streaming Conformer ASR

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

180

18 Apr 2023

A CTC Alignment-based Non-autoregressive Transformer for End-to-end Automatic Speech Recognition

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023

173

15 Apr 2023

Robust and Context-Aware Real-Time Collaborative Robot Handling via Dynamic Gesture Commands

IEEE Robotics and Automation Letters (RA-L), 2023

Rui Chen

Alvin C M Shek

Changliu Liu

100

12 Apr 2023

Online Spatio-Temporal Learning with Target Projection

International Conference on Artificial Intelligence Circuits and Systems (ICAICS), 2023

191

11 Apr 2023

Sim-T: Simplify the Transformer Network by Multiplexing Technique for Speech Recognition

194

11 Apr 2023

Wav2code: Restore Clean Speech Representations via Codebook Lookup for Noise-Robust ASR

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023

Yuchen Hu

Cheng Chen

Qiu-shi Zhu

Eng Siong Chng

301

11 Apr 2023

Dual-Attention Neural Transducers for Efficient Wake Word Spotting in Speech Recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Saumya Yashmohini Sahai

Jing Liu

Thejaswi Muniyappa

Kanthashree Mysore Sathyendra

Anastasios Alexandridis

...

Ross McGowan

Ariya Rastrow

Feng-Ju Chang

Athanasios Mouchtaris

Siegfried Kunzmann

204

03 Apr 2023

Dialog act guided contextual adapter for personalized speech recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Feng-Ju Chang

Thejaswi Muniyappa

Kanthashree Mysore Sathyendra

Kailin Wei

Grant P. Strimel

Ross McGowan

124

31 Mar 2023

PROCTER: PROnunciation-aware ConTextual adaptER for personalized speech recognition in neural transducers

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

147

30 Mar 2023

AVFormer: Injecting Vision into Frozen Speech Models for Zero-Shot AV-ASR

Computer Vision and Pattern Recognition (CVPR), 2023

Paul Hongsuck Seo

Arsha Nagrani

Cordelia Schmid

199

29 Mar 2023

Cross-utterance ASR Rescoring with Graph-based Label Propagation

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Venkatesh Ravichandran

117

27 Mar 2023

Pyramid Multi-branch Fusion DCNN with Multi-Head Self-Attention for Mandarin Speech Recognition

210

23 Mar 2023

I3D: Transformer architectures with input-dependent dynamic depth for speech recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Yifan Peng

Jaesong Lee

Shinji Watanabe

215

14 Mar 2023

Probing neural representations of scene perception in a hippocampally dependent task using artificial neural networks

Computer Vision and Pattern Recognition (CVPR), 2023

Markus Frey

Christian F. Doeller

Caswell Barry

169

11 Mar 2023

An Overview on Language Models: Recent Developments and Outlook

APSIPA Transactions on Signal and Information Processing (TASIP), 2023

Chengwei Wei

Yun Cheng Wang

Bin Wang

C.-C. Jay Kuo

276

10 Mar 2023

MIXPGD: Hybrid Adversarial Training for Speech Recognition Systems

174

10 Mar 2023

End-to-End Speech Recognition: A Survey

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023

288

245

03 Mar 2023

Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages

...

398

347

02 Mar 2023

Building High-accuracy Multilingual ASR with Gated Language Experts and Curriculum Training

Automatic Speech Recognition & Understanding (ASRU), 2023

...

276

01 Mar 2023

N-best T5: Robust ASR Error Correction using Multiple Input Hypotheses and Constrained Decoding Space

Interspeech (Interspeech), 2023

Rao Ma

Mark Gales

Kate Knill

Mengjie Qian

251

01 Mar 2023

MoLE : Mixture of Language Experts for Multi-Lingual Automatic Speech Recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Yoohwan Kwon

Soo-Whan Chung

MoE

176

27 Feb 2023

Efficient CTC Regularization via Coarse Labels for End-to-End Speech Translation

Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023

Biao Zhang

Barry Haddow

Rico Sennrich

267

21 Feb 2023

A Sidecar Separator Can Convert a Single-Talker Speech Recognition System to a Multi-Talker One

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Lingwei Meng

185

20 Feb 2023

Speaker and Language Change Detection using Wav2vec2 and Whisper

Tijn Berns

Nik Vaessen

David A. van Leeuwen

163

18 Feb 2023

Confidence Score Based Speaker Adaptation of Conformer Speech Recognition Systems

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023

Jiajun Deng

Tianzi Wang

Zengrui Jin

Shujie Hu

158

15 Feb 2023

Knowledge Transfer from Pre-trained Language Models to Cif-based Speech Recognizers via Hierarchical Distillation

Interspeech (Interspeech), 2023

Minglun Han

Bo Xu

216

30 Jan 2023

Achieving Timestamp Prediction While Recognizing with Non-Autoregressive End-to-End ASR Model

159

29 Jan 2023

Regeneration Learning: A Learning Paradigm for Data Generation

AAAI Conference on Artificial Intelligence (AAAI), 2023

Xu Tan

Jiang Bian

143

21 Jan 2023

Neural Architecture Search: Insights from 1000 Papers

Katharina Eggensperger

3DV AI4CE

409

192

20 Jan 2023

Two Stage Contextual Word Filtering for Context bias in Unified Streaming and Non-streaming Transducer

Interspeech (Interspeech), 2023

214

17 Jan 2023

BayesSpeech: A Bayesian Transformer Network for Automatic Speech Recognition

Will Rieger

BDL UQCV

125

16 Jan 2023