Listen, Attend and Spell

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2015

5 August 2015

Papers citing "Listen, Attend and Spell"

50 / 1,064 papers shown

SpeeChain: A Speech Toolkit for Large-Scale Machine Speech Chain

188

08 Jan 2023

Object Segmentation with Audio Context

189

04 Jan 2023

Sample-Efficient Unsupervised Domain Adaptation of Speech Recognition Systems A case study for Modern Greek

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022

Georgios Paraskevopoulos

Theodoros Kouzelis

Georgios Rouvalis

Athanasios Katsamanis

Vassilis Katsouros

Alexandros Potamianos

VLM

280

31 Dec 2022

4D ASR: Joint modeling of CTC, Attention, Transducer, and Mask-Predict decoders

Interspeech (Interspeech), 2022

Jiatong Shi

116

21 Dec 2022

Attention as a Guide for Simultaneous Speech Translation

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

Sara Papi

Matteo Negri

Marco Turchi

208

15 Dec 2022

GAMMA: Generative Augmentation for Attentive Marine Debris Detection

Vaishnavi Khindkar

Janhavi Khindkar

ViT

100

07 Dec 2022

Lattice-Free Sequence Discriminative Training for Phoneme-Based Neural Transducers

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

228

07 Dec 2022

Learning the joint distribution of two sequences using little or no paired data

262

06 Dec 2022

LMEC: Learnable Multiplicative Absolute Position Embedding Based Conformer for Speech Recognition

Yuguang Yang

Yu Pan

Jingjing Yin

Heng Lu

251

05 Dec 2022

Continual Learning for On-Device Speech Recognition using Disentangled Conformers

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

284

02 Dec 2022

Neural Transducer Training: Reduced Memory Consumption with Sample-wise Computation

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Stefan Braun

Erik McDermott

Roger Hsiao

147

29 Nov 2022

Unsupervised Model-based speaker adaptation of end-to-end lattice-free MMI model for speech recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

242

17 Nov 2022

Continuous Soft Pseudo-Labeling in ASR

272

11 Nov 2022

Massively Multilingual ASR on 70 Languages: Tokenization, Architecture, and Generalization Capabilities

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Ozlem Kalinli

226

10 Nov 2022

Adaptive Multi-Corpora Language Model Training for Speech Recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Yingyi Ma

Zhe Liu

Xuedong Zhang

188

09 Nov 2022

Peak-First CTC: Reducing the Peak Latency of CTC Models by Applying Peak-First Regularization

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

118

07 Nov 2022

Deliberation Networks and How to Train Them

Qingyun Dou

Mark Gales

115

06 Nov 2022

Multi-blank Transducers for Speech Recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Hainan Xu

Fei Jia

Somshubra Majumdar

Shinji Watanabe

Boris Ginsburg

215

04 Nov 2022

Once-for-All Sequence Compression for Self-Supervised Speech Models

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Hsuan-Jui Chen

Yen Meng

Hung-yi Lee

289

04 Nov 2022

The ISCSLP 2022 Intelligent Cockpit Speech Recognition Challenge (ICSRC): Dataset, Tracks, Baseline and Results

International Symposium on Chinese Spoken Language Processing (ISCSLP), 2022

Ao Zhang

Longbiao Wang

Hui Bu

Binbin Zhang

Wei Chen

Xin Xu

201

03 Nov 2022

Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Speech Processing

Neural Information Processing Systems (NeurIPS), 2022

Kaizhi Qian

354

02 Nov 2022

Internal Language Model Estimation based Adaptive Language Model Fusion for Domain Adaptation

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Rao Ma

170

02 Nov 2022

Fast-U2++: Fast and Accurate End-to-End Speech Recognition in Joint CTC/Attention Frames

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

BinBin Zhang

02 Nov 2022

Conversation-oriented ASR with multi-look-ahead CBS architecture

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

238

02 Nov 2022

InterMPL: Momentum Pseudo-Labeling with Intermediate CTC Loss

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

284

02 Nov 2022

TrimTail: Low-Latency Streaming ASR with Simple but Effective Spectrogram-Level Length Penalty

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Xingcheng Song

Di Wu

Zhiyong Wu

Binbin Zhang

241

01 Nov 2022

Speech-text based multi-modal training with bidirectional attention for improved speech recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Sheng Li

178

01 Nov 2022

Joint Audio/Text Training for Transformer Rescorer of Streaming Speech Recognition

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

Ozlem Kalinli

195

31 Oct 2022

Structured State Space Decoder for Speech Recognition and Synthesis

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Koichi Miyazaki

Masato Murata

Tomoki Koriyama

258

31 Oct 2022

FusionFormer: Fusing Operations in Transformer for Efficient Streaming Speech Recognition

Xingcheng Song

Di Wu

Binbin Zhang

Zhiyong Wu

...

133

31 Oct 2022

Modular Hybrid Autoregressive Transducer

Spoken Language Technology Workshop (SLT), 2022

...

Bhuvana Ramabhadran

188

31 Oct 2022

Blank Collapse: Compressing CTC emission for the faster decoding

Interspeech (Interspeech), 2022

237

31 Oct 2022

Partitioned Gradient Matching-based Data Subset Selection for Compute-Efficient Robust ASR Training

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

Ganesh Ramakrishnan

162

30 Oct 2022

BERT Meets CTC: New Formulation of End-to-End Speech Recognition with Pre-trained Masked Language Model

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

255

29 Oct 2022

Accelerating RNN-T Training and Inference Using CTC guidance

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Zhehuai Chen

205

29 Oct 2022

Filter and evolve: progressive pseudo label refining for semi-supervised automatic speech recognition

140

28 Oct 2022

Random Utterance Concatenation Based Data Augmentation for Improving Short-video Speech Recognition

Interspeech (Interspeech), 2022

Yerbolat Khassanov

151

28 Oct 2022

Token-level Sequence Labeling for Spoken Language Understanding using Compositional End-to-End Models

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

124

27 Oct 2022

Monotonic segmental attention for automatic speech recognition

Spoken Language Technology Workshop (SLT), 2022

129

26 Oct 2022

Linguistic-Enhanced Transformer with CTC Embedding for Speech Recognition

International Conference on Mobile Ad-hoc and Sensor Networks (MSN), 2022

105

25 Oct 2022

ESB: A Benchmark For Multi-Domain End-to-End Speech Recognition

Sanchit Gandhi

Patrick von Platen

Alexander M. Rush

141

24 Oct 2022

Optimizing Bilingual Neural Transducer with Synthetic Code-switching Text Generation

Thien Nguyen

Nathalie Tran

Liuhui Deng

Thiago Fraga da Silva

...

244

21 Oct 2022

Improving Semi-supervised End-to-end Automatic Speech Recognition using CycleGAN and Inter-domain Losses

Spoken Language Technology Workshop (SLT), 2022

C. Li

Ngoc Thang Vu

144

20 Oct 2022

Anchored Speech Recognition with Neural Transducers

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Ozlem Kalinli

237

20 Oct 2022

End-to-End Integration of Speech Recognition, Dereverberation, Beamforming, and Self-Supervised Learning Representation

Spoken Language Technology Workshop (SLT), 2022

243

19 Oct 2022

Helpful Neighbors: Leveraging Neighbors in Geographic Feature Pronunciation

Transactions of the Association for Computational Linguistics (TACL), 2022

194

18 Oct 2022

Acoustic-aware Non-autoregressive Spell Correction with Mask Sample Decoding

183

16 Oct 2022

A Policy-based Approach to the SpecAugment Method for Low Resource E2E ASR

Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2022

131

16 Oct 2022

On Compressing Sequences for Self-Supervised Speech Models

Spoken Language Technology Workshop (SLT), 2022

Jiatong Shi

Hao Tang

195

13 Oct 2022

An Experimental Study on Private Aggregation of Teacher Ensemble Learning for End-to-End Speech Recognition

Spoken Language Technology Workshop (SLT), 2022

Chao-Han Huck Yang

I-Fan Chen

A. Stolcke

Sabato Marco Siniscalchi

Chin-Hui Lee

173

11 Oct 2022