Listen, Attend and Spell

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2015

5 August 2015

Papers citing "Listen, Attend and Spell"

50 / 1,064 papers shown

Training Spiking Neural Networks Using Lessons From Deep Learning

Proceedings of the IEEE (Proc. IEEE), 2021

Wei D. Lu

550

676

27 Sep 2021

ChannelAugment: Improving generalization of multi-channel ASR by training with input channel randomization

Automatic Speech Recognition & Understanding (ASRU), 2021

101

23 Sep 2021

Wav-BERT: Cooperative Acoustic and Linguistic Representation Learning for Low-Resource Speech Recognition

Xiaodan Liang

202

19 Sep 2021

Dual-Encoder Architecture with Encoder Selection for Joint Close-Talk and Far-Talk Speech Recognition

17 Sep 2021

PDAugment: Data Augmentation by Pitch and Duration Adjustments for Automatic Lyrics Transcription

Xu Tan

133

16 Sep 2021

Utterance-level neural confidence measure for end-to-end children speech recognition

W. Liu

Tan Lee

110

16 Sep 2021

Performance-Efficiency Trade-offs in Unsupervised Pre-training for Speech Recognition

Kwangyoun Kim

163

14 Sep 2021

Self-Attention Channel Combinator Frontend for End-to-End Multichannel Far-field Speech Recognition

Interspeech (Interspeech), 2021

192

10 Sep 2021

Tree-constrained Pointer Generator for End-to-end Contextual Speech Recognition

Automatic Speech Recognition & Understanding (ASRU), 2021

Guangzhi Sun

Chao Zhang

P. Woodland

217

01 Sep 2021

Greenformers: Improving Computation and Memory Efficiency in Transformer Models via Low-Rank Approximation

Samuel Cahyawijaya

192

24 Aug 2021

Reducing Exposure Bias in Training Recurrent Neural Network Transducers

Interspeech (Interspeech), 2021

130

24 Aug 2021

Multilingual Speech Recognition for Low-Resource Indian Languages using Multi-Task conformer

Krishna D N (Freshworks)

22 Aug 2021

A Dual-Decoder Conformer for Multilingual Speech Recognition

Krishna D N (Freshworks)

22 Aug 2021

Using Large Pre-Trained Models with Cross-Modal Attention for Multi-Modal Emotion Recognition

Krishna D N (Freshworks)

134

22 Aug 2021

Generalizing RNN-Transducer to Out-Domain Audio via Sparse Self-Attention Layers

Interspeech (Interspeech), 2021

Juntae Kim

Jee-Hye Lee

189

22 Aug 2021

A Light-weight contextual spelling correction model for customizing transducer-based speech recognition systems

148

17 Aug 2021

SpecMix: A Mixed Sample Data Augmentation method for Training with Time-Frequency Domain Features

Interspeech (Interspeech), 2021

Gwantae Kim

D. Han

Hanseok Ko

138

06 Aug 2021

Knowledge Distillation from BERT Transformer to Speech Transformer for Intent Classification

Interspeech (Interspeech), 2021

Yiding Jiang

Bidisha Sharma

Maulik C. Madhavi

Haizhou Li

162

05 Aug 2021

Adversarial Data Augmentation for Disordered Speech Recognition

Zengrui Jin

119

02 Aug 2021

Facetron: A Multi-speaker Face-to-Speech Model based on Cross-modal Latent Representations

European Signal Processing Conference (EUSIPCO), 2021

302

26 Jul 2021

Ensemble of Convolution Neural Networks on Heterogeneous Signals for Sleep Stage Scoring

Social Science Research Network (SSRN), 2021

Enrique Fernández-Blanco

C. Fernandez-Lozano

A. Pazos

Daniel Rivero

120

23 Jul 2021

VAD-free Streaming Hybrid CTC/Attention ASR for Unsegmented Recording

Interspeech (Interspeech), 2021

Hirofumi Inaguma

Tatsuya Kawahara

189

15 Jul 2021

A Configurable Multilingual Model is All You Need to Recognize All Languages

230

13 Jul 2021

ReconVAT: A Semi-Supervised Automatic Music Transcription Framework for Low-Resource Real-World Data

K. Cheuk

Dorien Herremans

Li Su

365

11 Jul 2021

On lattice-free boosted MMI training of HMM and CTC-based full-context ASR models

Automatic Speech Recognition & Understanding (ASRU), 2021

278

09 Jul 2021

End-to-End Rich Transcription-Style Automatic Speech Recognition with Semi-Supervised Learning

128

07 Jul 2021

Instant One-Shot Word-Learning for Context-Specific Neural Sequence-to-Sequence Speech Recognition

179

05 Jul 2021

Relaxed Attention: A Simple Method to Boost Performance of End-to-End Automatic Speech Recognition

151

02 Jul 2021

What do End-to-End Speech Models Learn about Speaker, Language and Channel Information? A Layer-wise and Neuron-level Analysis

Shammur A. Chowdhury

Nadir Durrani

Ahmed M. Ali

366

01 Jul 2021

On joint training with interfaces for spoken language understanding

Interspeech (Interspeech), 2021

206

30 Jun 2021

A Survey on Neural Speech Synthesis

Xu Tan

344

435

29 Jun 2021

Where are we in semantic concept extraction for Spoken Language Understanding?

196

24 Jun 2021

Towards Automatic Speech to Sign Language Generation

Parul Kapoor

Rudrabha Mukhopadhyay

155

24 Jun 2021

Efficient Conformer with Prob-Sparse Attention Mechanism for End-to-End Speech Recognition

Xiong Wang

Sining Sun

Lei Xie

Long Ma

113

17 Jun 2021

Layer Pruning on Demand with Intermediate CTC

Jaesong Lee

Jingu Kang

Shinji Watanabe

130

17 Jun 2021

Momentum Pseudo-Labeling for Semi-Supervised Speech Recognition

236

16 Jun 2021

Attention-Based Keyword Localisation in Speech using Visual Grounding

Kayode Olaleye

Herman Kamper

108

16 Jun 2021

SynthASR: Unlocking Synthetic Data for Speech Recognition

Interspeech (Interspeech), 2021

A. Fazel

Wei Yang

Yulan Liu

Roberto Barra-Chicote

169

14 Jun 2021

Improving RNN-T ASR Performance with Date-Time and Location Awareness

Workshop on Time-Delay Systems (TS), 2021

116

11 Jun 2021

Raw Waveform Encoder with Multi-Scale Globally Attentive Locally Recurrent Networks for End-to-End Speech Recognition

Interspeech (Interspeech), 2021

124

08 Jun 2021

Data Augmentation Methods for End-to-end Speech Recognition on Distant-Talk Scenarios

Interspeech (Interspeech), 2021

120

07 Jun 2021

Approximate Fixed-Points in Recurrent Neural Networks

Zhengxiong Wang

Anton Ragni

04 Jun 2021

Minimum Word Error Rate Training with Language Model Fusion for End-to-End Speech Recognition

Interspeech (Interspeech), 2021

Xie Chen

145

04 Jun 2021

Towards One Model to Rule All: Multilingual Strategy for Dialectal Code-Switching Arabic ASR

Interspeech (Interspeech), 2021

267

31 May 2021

Listen with Intent: Improving Speech Recognition with Audio-to-Intent Front-End

Interspeech (Interspeech), 2021

175

14 May 2021

Exploring CTC Based End-to-End Techniques for Myanmar Speech Recognition

Khin Me Me Chit

Laet Laet Lin

108

13 May 2021

Quantifying and Maximizing the Benefits of Back-End Noise Adaption on Attention-Based Speech Recognition Models

Coleman Hooper

Thierry Tambe

Gu-Yeon Wei

108

03 May 2021

On the limit of English conversational speech recognition

Interspeech (Interspeech), 2021

Zoltán Tüske

G. Saon

Brian Kingsbury

183

03 May 2021

On Addressing Practical Challenges for RNN-Transducer

Automatic Speech Recognition & Understanding (ASRU), 2021

262

27 Apr 2021

Bridging the gap between streaming and non-streaming ASR systems by distilling ensembles of CTC and RNN-T models

Interspeech (Interspeech), 2021

139

25 Apr 2021