Attention-Based Models for Speech Recognition

24 June 2015
J. Chorowski
Dzmitry Bahdanau
Dmitriy Serdyuk
Kyunghyun Cho
Yoshua Bengio

Papers citing "Attention-Based Models for Speech Recognition"

50 / 313 papers shown
Title
A Deep-Learning Intelligent System Incorporating Data Augmentation for Short-Term Voltage Stability Assessment of Power Systems
Yang Li
Meng Zhang
C. L. P. Chen
31
128
0
05 Dec 2021
Global-Local Attention for Emotion Recognition
Nhat Le
Khanh Nguyen
A. Nguyen
H. Le
CVBM
25
37
0
07 Nov 2021
Effective Cross-Utterance Language Modeling for Conversational Speech Recognition
Bi-Cheng Yan
Hsin-Wei Wang
Shih-Hsuan Chiu
Hsuan-Sheng Chiu
Berlin Chen
18
1
0
05 Nov 2021
Context-Aware Transformer Transducer for Speech Recognition
Feng-Ju Chang
Jing Liu
Martin H. Radfar
Athanasios Mouchtaris
M. Omologo
Ariya Rastrow
Siegfried Kunzmann
13
79
0
05 Nov 2021
Sequence-to-Sequence Modeling for Action Identification at High Temporal Resolution
Aakash Kaku
Kangning Liu
A. Parnandi
H. Rajamohan
Kannan Venkataramanan
Anita Venkatesan
Audre Wirtanen
Natasha Pandit
Heidi M. Schambra
C. Fernandez‐Granda
24
5
0
03 Nov 2021
Exploring Non-Autoregressive End-To-End Neural Modeling For English Mispronunciation Detection And Diagnosis
Hsin-Wei Wang
Bi-Cheng Yan
Hsuan-Sheng Chiu
Yung-Chang Hsu
Berlin Chen
16
7
0
01 Nov 2021
Discrete Acoustic Space for an Efficient Sampling in Neural Text-To-Speech
Mu-Wei Li
Jonas Rohnke
A. Bonafonte
Mateusz Lajszczak
Trevor Wood
DRL
17
2
0
24 Oct 2021
Neural Dubber: Dubbing for Videos According to Scripts
Chenxu Hu
Qiao Tian
Tingle Li
Yuping Wang
Yuxuan Wang
Hang Zhao
DiffM
VGen
36
39
0
15 Oct 2021
MEDUSA: Multi-scale Encoder-Decoder Self-Attention Deep Neural Network Architecture for Medical Image Analysis
Hossein Aboutalebi
Maya Pavlova
Hayden Gunraj
M. Shafiee
A. Sabri
Amer Alaref
Alexander Wong
20
17
0
12 Oct 2021
Hierarchical Conditional End-to-End ASR with CTC and Multi-Granular Subword Units
Yosuke Higuchi
Keita Karube
Tetsuji Ogawa
Tetsunori Kobayashi
16
22
0
08 Oct 2021
KaraSinger: Score-Free Singing Voice Synthesis with VQ-VAE using Mel-spectrograms
Chien-Feng Liao
Jen-Yu Liu
Yi-Hsuan Yang
19
5
0
08 Oct 2021
Factorized Neural Transducer for Efficient Language Model Adaptation
Xie Chen
Zhong Meng
S. Parthasarathy
Jinyu Li
18
39
0
27 Sep 2021
KOHTD: Kazakh Offline Handwritten Text Dataset
N. Toiganbayeva
M. Kasem
Galymzhan Abdimanap
K. Bostanbekov
Abdelrahman Abdallah
Anel N. Alimova
D. Nurseitov
16
23
0
22 Sep 2021
iRNN: Integer-only Recurrent Neural Network
Eyyub Sari
Vanessa Courville
V. Nia
MQ
45
4
0
20 Sep 2021
Neural HMMs are all you need (for high-quality attention-free TTS)
Shivam Mehta
Éva Székely
Jonas Beskow
G. Henter
25
18
0
30 Aug 2021
A Comparison of Deep Saliency Map Generators on Multispectral Data in Object Detection
Jens Bayer
David Munch
Michael Arens
3DPC
30
3
0
26 Aug 2021
Reducing Exposure Bias in Training Recurrent Neural Network Transducers
Xiaodong Cui
Brian Kingsbury
G. Saon
David Haws
Zoltán Tüske
16
5
0
24 Aug 2021
A3GC-IP: Attention-Oriented Adjacency Adaptive Recurrent Graph Convolutions for Human Pose Estimation from Sparse Inertial Measurements
Patrik Puchert
Timo Ropinski
3DH
17
3
0
23 Jul 2021
Machine Learning for Stuttering Identification: Review, Challenges and Future Directions
S. A. Sheikh
Md. Sahidullah
F. Hirsch
Slim Ouni
26
48
0
08 Jul 2021
Normalizing Flow based Hidden Markov Models for Classification of Speech Phones with Explainability
Anubhab Ghosh
Antoine Honoré
Dong Liu
G. Henter
S. Chatterjee
9
5
0
01 Jul 2021
A Survey on Neural Speech Synthesis
Xu Tan
Tao Qin
Frank Soong
Tie-Yan Liu
AI4TS
18
352
0
29 Jun 2021
GANSpeech: Adversarial Training for High-Fidelity Multi-Speaker Speech Synthesis
Jinhyeok Yang
Jaesung Bae
Taejun Bak
Young-Ik Kim
Hoon-Young Cho
26
36
0
29 Jun 2021
Streaming end-to-end speech recognition with jointly trained neural feature enhancement
Chanwoo Kim
Abhinav Garg
Dhananjaya N. Gowda
Seongkyu Mun
C. Han
AuLLM
18
6
0
04 May 2021
On the limit of English conversational speech recognition
Zoltán Tüske
G. Saon
Brian Kingsbury
19
50
0
03 May 2021
Large-Scale Self- and Semi-Supervised Learning for Speech Translation
Changhan Wang
Anne Wu
J. Pino
Alexei Baevski
Michael Auli
Alexis Conneau
SSL
31
44
0
14 Apr 2021
Timers and Such: A Practical Benchmark for Spoken Language Understanding with Numbers
Loren Lugosch
Piyush Papreja
Mirco Ravanelli
A. Heba
Titouan Parcollet
19
12
0
04 Apr 2021
Attention Forcing for Machine Translation
Qingyun Dou
Yiting Lu
Potsawee Manakul
Xixin Wu
Mark J. F. Gales
23
7
0
02 Apr 2021
Residual Energy-Based Models for End-to-End Speech Recognition
Qiujia Li
Yu Zhang
Bo-wen Li
Liangliang Cao
P. Woodland
23
13
0
25 Mar 2021
Learning Word-Level Confidence For Subword End-to-End ASR
David Qiu
Qiujia Li
Yanzhang He
Yu Zhang
Bo-wen Li
...
Deepti Bhatia
Wei Li
Ke Hu
Tara N. Sainath
Ian McGraw
24
32
0
11 Mar 2021
Enhanced 3D Human Pose Estimation from Videos by using Attention-Based Neural Network with Dilated Convolutions
Ruixu Liu
Ju Shen
He-Nan Wang
C. L. P. Chen
S. Cheung
V. Asari
3DH
23
29
0
04 Mar 2021
End-to-end acoustic modelling for phone recognition of young readers
Lucile Gelin
Morgane Daniel
J. Pinquier
Thomas Pellegrini
16
13
0
04 Mar 2021
Video Sentiment Analysis with Bimodal Information-augmented Multi-Head Attention
Ting-Wei Wu
Jun-jie Peng
Wenqiang Zhang
Huiran Zhang
Chuan Ma
Yansong Huang
24
84
0
03 Mar 2021
Neural Code Summarization
Piyush Shrivastava
20
2
0
26 Feb 2021
Revisiting Classification Perspective on Scene Text Recognition
Hongxiang Cai
Jun Sun
Yichao Xiong
16
10
0
22 Feb 2021
Leveraging Acoustic and Linguistic Embeddings from Pretrained speech and language Models for Intent Classification
Bidisha Sharma
Maulik C. Madhavi
Haizhou Li
16
19
0
15 Feb 2021
VARA-TTS: Non-Autoregressive Text-to-Speech Synthesis based on Very Deep VAE with Residual Attention
Peng Liu
Yuewen Cao
Songxiang Liu
Na Hu
Guangzhi Li
Chao Weng
Dan Su
34
22
0
12 Feb 2021
GTA: Global Temporal Attention for Video Action Understanding
Bo He
Xitong Yang
Zuxuan Wu
Hao Chen
Ser-Nam Lim
Abhinav Shrivastava
ViT
33
27
0
15 Dec 2020
A review of on-device fully neural end-to-end automatic speech recognition algorithms
Chanwoo Kim
Dhananjaya N. Gowda
Dongsoo Lee
Jiyeon Kim
Ankur Kumar
Sungsoo Kim
Abhinav Garg
C. Han
19
27
0
14 Dec 2020
Less Is More: Improved RNN-T Decoding Using Limited Label Context and Path Merging
Rohit Prabhavalkar
Yanzhang He
David Rybach
S. Campbell
A. Narayanan
Trevor Strohman
Tara N. Sainath
41
35
0
12 Dec 2020
Deep Learning Approach for Matrix Completion Using Manifold Learning
Saeid Mehrdad
M. Kahaei
16
6
0
11 Dec 2020
End-to-end Handwritten Paragraph Text Recognition Using a Vertical Attention Network
Denis Coquenet
Clément Chatelain
Thierry Paquet
AI4TS
35
77
0
07 Dec 2020
Unsupervised Domain Adaptation for Speech Recognition via Uncertainty Driven Self-Training
Sameer Khurana
Niko Moritz
Takaaki Hori
Jonathan Le Roux
16
54
0
26 Nov 2020
Streaming end-to-end multi-talker speech recognition
Liang Lu
Naoyuki Kanda
Jinyu Li
Y. Gong
11
41
0
26 Nov 2020
Attention Aware Cost Volume Pyramid Based Multi-view Stereo Network for 3D Reconstruction
Anzhu Yu
Wenyue Guo
Bing Liu
Xin Chen
Xin Eric Wang
Xuefeng Cao
Bingchuan Jiang
3DV
11
64
0
25 Nov 2020
Using Synthetic Audio to Improve The Recognition of Out-Of-Vocabulary Words in End-To-End ASR Systems
Xianrui Zheng
Yulan Liu
Deniz Gunceler
D. Willett
17
78
0
23 Nov 2020
Deep Shallow Fusion for RNN-T Personalization
Duc Le
Gil Keren
Julian Chan
Jay Mahadeokar
Christian Fuegen
M. Seltzer
21
77
0
16 Nov 2020
Wave-Tacotron: Spectrogram-free end-to-end text-to-speech synthesis
Ron J. Weiss
RJ Skerry-Ryan
Eric Battenberg
Soroosh Mariooryad
Diederik P. Kingma
19
97
0
06 Nov 2020
A Multi-Channel Temporal Attention Convolutional Neural Network Model for Environmental Sound Classification
You Wang
Chuyao Feng
David V. Anderson
11
17
0
04 Nov 2020
Internal Language Model Estimation for Domain-Adaptive End-to-End Speech Recognition
Zhong Meng
S. Parthasarathy
Eric Sun
Yashesh Gaur
Naoyuki Kanda
Liang Lu
Xie Chen
Rui Zhao
Jinyu Li
Y. Gong
AuLLM
19
107
0
03 Nov 2020
Cascaded encoders for unifying streaming and non-streaming ASR
A. Narayanan
Tara N. Sainath
Ruoming Pang
Jiahui Yu
Chung-Cheng Chiu
Rohit Prabhavalkar
Ehsan Variani
Trevor Strohman
AuLLM
6
85
0
27 Oct 2020