ResearchTrend.AI
Listen, Attend and Spell

5 August 2015
William Chan
Navdeep Jaitly
Quoc V. Le
Oriol Vinyals
    RALM
arXiv: 1508.01211

Papers citing "Listen, Attend and Spell"

50 / 510 papers shown
MSR-86K: An Evolving, Multilingual Corpus with 86,300 Hours of Transcribed Audio for Speech Recognition Research
Song Li
Yongbin You
Xuezhi Wang
Zhengkun Tian
Ke Ding
Guanglu Wan
21
1
0
26 Jun 2024
Token-Weighted RNN-T for Learning from Flawed Data
Gil Keren
Wei Zhou
Ozlem Kalinli
43
0
0
26 Jun 2024
Automatic speech recognition for the Nepali language using CNN, bidirectional LSTM and ResNet
Manish Dhakal
Arman Chhetri
Aman Kumar Gupta
Prabin B. Lamichhane
S. Pandey
S. Shakya
AI4TS
27
10
0
25 Jun 2024
InterBiasing: Boost Unseen Word Recognition through Biasing Intermediate Predictions
Yu Nakagome
Michael Hentschel
42
0
0
21 Jun 2024
Instruction Data Generation and Unsupervised Adaptation for Speech Language Models
Vahid Noroozi
Zhehuai Chen
Somshubra Majumdar
Steve Huang
Jagadeesh Balam
Boris Ginsburg
SyDa
36
3
0
18 Jun 2024
Guiding Frame-Level CTC Alignments Using Self-knowledge Distillation
Eungbeom Kim
Hantae Kim
Kyogu Lee
32
1
0
12 Jun 2024
Dual-Pipeline with Low-Rank Adaptation for New Language Integration in Multilingual ASR
Yerbolat Khassanov
Zhipeng Chen
Tianfeng Chen
Tze Yuang Chong
Wei Li
Jun Zhang
Lu Lu
Yuxuan Wang
AI4CE
16
0
0
12 Jun 2024
StreamAtt: Direct Streaming Speech-to-Text Translation with Attention-based Audio History Selection
Sara Papi
Marco Gaido
Matteo Negri
L. Bentivogli
64
4
0
10 Jun 2024
LoRA-Whisper: Parameter-Efficient and Extensible Multilingual ASR
Zheshu Song
Jianheng Zhuo
Yifan Yang
Ziyang Ma
Shixiong Zhang
Xie Chen
31
9
0
07 Jun 2024
Unveiling the Dynamics of Information Interplay in Supervised Learning
Kun Song
Zhiquan Tan
Bochao Zou
Huimin Ma
Weiran Huang
32
1
0
06 Jun 2024
Joint Optimization of Streaming and Non-Streaming Automatic Speech Recognition with Multi-Decoder and Knowledge Distillation
Muhammad Shakeel
Yui Sudo
Yifan Peng
Shinji Watanabe
24
0
0
22 May 2024
Contextualized Automatic Speech Recognition with Dynamic Vocabulary
Yui Sudo
Yosuke Fukumoto
Muhammad Shakeel
Yifan Peng
Shinji Watanabe
29
0
0
22 May 2024
Gated Low-rank Adaptation for personalized Code-Switching Automatic Speech Recognition on the low-spec devices
Gwantae Kim
Bokyeung Lee
Donghyeon Kim
Hanseok Ko
OffRL
23
0
0
24 Apr 2024
Transducers with Pronunciation-aware Embeddings for Automatic Speech Recognition
Hainan Xu
Zhehuai Chen
Fei Jia
Boris Ginsburg
33
0
0
04 Apr 2024
Effective internal language model training and fusion for factorized transducer model
Jinxi Guo
Niko Moritz
Yingyi Ma
Frank Seide
Chunyang Wu
Jay Mahadeokar
Ozlem Kalinli
Christian Fuegen
Michael Seltzer
38
1
0
02 Apr 2024
Enhancing Efficiency in Vision Transformer Networks: Design Techniques and Insights
Moein Heidari
Reza Azad
Sina Ghorbani Kolahi
René Arimond
Leon Niggemeier
...
Afshin Bozorgpour
Ehsan Khodapanah Aghdam
A. Kazerouni
I. Hacihaliloglu
Dorit Merhof
45
7
0
28 Mar 2024
M$^3$AV: A Multimodal, Multigenre, and Multipurpose Audio-Visual Academic Lecture Dataset
Zhe Chen
Heyang Liu
Wenyi Yu
Guangzhi Sun
Hongcheng Liu
Ji Wu
Chao Zhang
Yu Wang
Yanfeng Wang
VGen
49
1
0
21 Mar 2024
Advanced Long-Content Speech Recognition With Factorized Neural Transducer
Xun Gong
Yu Wu
Jinyu Li
Shujie Liu
Rui Zhao
Xie Chen
Yanmin Qian
29
6
0
20 Mar 2024
Skipformer: A Skip-and-Recover Strategy for Efficient Speech Recognition
Wenjing Zhu
Sining Sun
Changhao Shan
Peng Fan
Qing Yang
29
1
0
13 Mar 2024
The evaluation of a code-switched Sepedi-English automatic speech recognition system
Amanda Phaladi
T. Modipa
30
0
0
11 Mar 2024
A&B BNN: Add&Bit-Operation-Only Hardware-Friendly Binary Neural Network
Ruichen Ma
G. Qiao
Yián Liu
L. Meng
N. Ning
Yang Liu
Shaogang Hu
AAML
MQ
39
3
0
06 Mar 2024
Towards Accurate Lip-to-Speech Synthesis in-the-Wild
Sindhu B. Hegde
Rudrabha Mukhopadhyay
C. V. Jawahar
Vinay P. Namboodiri
27
4
0
02 Mar 2024
Post-decoder Biasing for End-to-End Speech Recognition of Multi-turn Medical Interview
Heyang Liu
Yu Wang
Yanfeng Wang
38
0
0
01 Mar 2024
Extreme Encoder Output Frame Rate Reduction: Improving Computational Latencies of Large End-to-End Models
Rohit Prabhavalkar
Zhong Meng
Weiran Wang
Adam Stooke
Xingyu Cai
Yanzhang He
Arun Narayanan
Dongseong Hwang
Tara N. Sainath
Pedro J. Moreno
30
8
0
27 Feb 2024
Alternating Weak Triphone/BPE Alignment Supervision from Hybrid Model Improves End-to-End ASR
Jintao Jiang
Yingbo Gao
Mohammad Zeineldeen
Zoltán Tüske
34
0
0
23 Feb 2024
How do Hyenas deal with Human Speech? Speech Recognition and Translation with ConfHyena
Marco Gaido
Sara Papi
Matteo Negri
L. Bentivogli
31
1
0
20 Feb 2024
Comparison of Conventional Hybrid and CTC/Attention Decoders for Continuous Visual Speech Recognition
David Gimeno-Gómez
Carlos David Martínez Hinarejos
32
1
0
20 Feb 2024
An Embarrassingly Simple Approach for LLM with Strong ASR Capacity
Ziyang Ma
Guanrou Yang
Yifan Yang
Zhifu Gao
Jiaming Wang
...
Fan Yu
Qian Chen
Siqi Zheng
Shiliang Zhang
Xie Chen
AuLLM
47
38
0
13 Feb 2024
Self-consistent context aware conformer transducer for speech recognition
Konstantin Kolokolov
Pavel Pekichev
Karthik Raghunathan
22
0
0
09 Feb 2024
Contextualized Automatic Speech Recognition with Attention-Based Bias Phrase Boosted Beam Search
Yui Sudo
Muhammad Shakeel
Yosuke Fukumoto
Yifan Peng
Shinji Watanabe
26
5
0
19 Jan 2024
Improving ASR Contextual Biasing with Guided Attention
Jiyang Tang
Kwangyoun Kim
Suwon Shon
Felix Wu
Prashant Sridhar
Shinji Watanabe
19
8
0
16 Jan 2024
LCB-net: Long-Context Biasing for Audio-Visual Speech Recognition
Fan Yu
Haoxu Wang
Xian Shi
Shiliang Zhang
19
3
0
12 Jan 2024
Cross-Speaker Encoding Network for Multi-Talker Speech Recognition
Jiawen Kang
Lingwei Meng
Mingyu Cui
Haohan Guo
Xixin Wu
Xunying Liu
Helen M. Meng
46
5
0
08 Jan 2024
A unified multichannel far-field speech recognition system: combining neural beamforming with attention based end-to-end model
Dongdi Zhao
Jianbo Ma
Lu Lu
Jinke Li
Xuan Ji
Lei Zhu
Fuming Fang
Ming-Yu Liu
Feijun Jiang
10
1
0
05 Jan 2024
CTC Blank Triggered Dynamic Layer-Skipping for Efficient CTC-based Speech Recognition
Junfeng Hou
Peiyao Wang
Jincheng Zhang
Meng-Da Yang
Minwei Feng
Jingcheng Yin
27
1
0
04 Jan 2024
BLSTM-Based Confidence Estimation for End-to-End Speech Recognition
A. Ogawa
Naohiro Tawara
Takatomo Kano
Marc Delcroix
40
4
0
22 Dec 2023
Speaker Mask Transformer for Multi-talker Overlapped Speech Recognition
Peng Shen
Xugang Lu
Hisashi Kawai
27
1
0
18 Dec 2023
Conformer-Based Speech Recognition On Extreme Edge-Computing Devices
Mingbin Xu
Alex Jin
Sicheng Wang
Mu Su
Tim Ng
...
Shiyi Han
Zhihong Lei
Yaqiao Deng
Zhen Huang
Mahesh Krishnamoorthy
21
4
0
16 Dec 2023
USM-Lite: Quantization and Sparsity Aware Fine-tuning for Speech Recognition with Universal Speech Models
Shaojin Ding
David Qiu
David Rim
Yanzhang He
Oleg Rybakov
...
Tara N. Sainath
Zhonglin Han
Jian Li
Amir Yazdanbakhsh
Shivani Agrawal
MQ
26
9
0
13 Dec 2023
D4AM: A General Denoising Framework for Downstream Acoustic Models
H. Wang
Yu Tsao
Hsin-Min Wang
Chu-Song Chen
13
4
0
28 Nov 2023
Weak Alignment Supervision from Hybrid Model Improves End-to-end ASR
Jintao Jiang
Yingbo Gao
Zoltán Tüske
21
1
0
24 Nov 2023
LIP-RTVE: An Audiovisual Database for Continuous Spanish in the Wild
David Gimeno-Gómez
Carlos David Martínez Hinarejos
11
8
0
21 Nov 2023
Phonological Level wav2vec2-based Mispronunciation Detection and Diagnosis Method
M. Shahin
Julien Epps
Beena Ahmed
16
1
0
13 Nov 2023
End-to-End Single-Channel Speaker-Turn Aware Conversational Speech Translation
Juan Pablo Zuluaga
Zhaocheng Huang
Xing Niu
Rohit Paturi
S. Srinivasan
Prashant Mathur
Brian Thompson
Marcello Federico
BDL
27
2
0
01 Nov 2023
Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling
Sanchit Gandhi
Patrick von Platen
Alexander M. Rush
VLM
19
51
0
01 Nov 2023
Key Frame Mechanism For Efficient Conformer Based End-to-end Speech Recognition
Peng Fan
Changhao Shan
Sining Sun
Qing Yang
Jianwei Zhang
25
3
0
23 Oct 2023
Tailoring Adversarial Attacks on Deep Neural Networks for Targeted Class Manipulation Using DeepFool Algorithm
S. M. Fazle
J. Mondal
Meem Arafat Manab
Xi Xiao
Sarfaraz Newaz
AAML
27
0
0
18 Oct 2023
End-to-End real time tracking of children's reading with pointer network
Vishal Sunder
Beulah Karrolla
Eric Fosler-Lussier
10
0
0
17 Oct 2023
Correction Focused Language Model Training for Speech Recognition
Yingyi Ma
Zhe Liu
Ozlem Kalinli
KELM
25
3
0
17 Oct 2023
Personalization of CTC-based End-to-End Speech Recognition Using Pronunciation-Driven Subword Tokenization
Zhihong Lei
Ernest Pusateri
Shiyi Han
Leo Liu
Mingbin Xu
...
R. Travadi
Youyuan Zhang
Mirko Hannemann
Man-Hung Siu
Zhen Huang
23
9
0
16 Oct 2023