arXiv: 1610.03022
Very Deep Convolutional Networks for End-to-End Speech Recognition

10 October 2016
Yu Zhang
William Chan
Navdeep Jaitly
    AI4TS

Papers citing "Very Deep Convolutional Networks for End-to-End Speech Recognition"

50 / 54 papers shown
Title
Automatic speech recognition for the Nepali language using CNN, bidirectional LSTM and ResNet
Manish Dhakal
Arman Chhetri
Aman Kumar Gupta
Prabin B. Lamichhane
S. Pandey
S. Shakya
AI4TS
35
10
0
25 Jun 2024
TS-ENAS:Two-Stage Evolution for Cell-based Network Architecture Search
Juan Zou
Shenghong Wu
Yizhang Xia
Weiwei Jiang
Zeping Wu
Jinhua Zheng
3DV
26
0
0
14 Oct 2023
Kernel Limit of Recurrent Neural Networks Trained on Ergodic Data Sequences
Samuel Chun-Hei Lam
Justin A. Sirignano
K. Spiliopoulos
30
2
0
28 Aug 2023
Sequential Estimation of Gaussian Process-based Deep State-Space Models
Yuhao Liu
Marzieh Ajirak
Petar M. Djurić
26
12
0
29 Jan 2023
Improving Semi-supervised End-to-end Automatic Speech Recognition using CycleGAN and Inter-domain Losses
C. Li
Ngoc Thang Vu
21
2
0
20 Oct 2022
ConvRNN-T: Convolutional Augmented Recurrent Neural Network Transducers for Streaming Speech Recognition
Martin H. Radfar
Rohit Barnwal
R. Swaminathan
Feng-Ju Chang
Grant P. Strimel
Nathan Susanj
Athanasios Mouchtaris
34
13
0
29 Sep 2022
Prostate Cancer Malignancy Detection and localization from mpMRI using auto-Deep Learning: One Step Closer to Clinical Utilization
Weiwei Zong
Eric N Carver
Simeng Zhu
E. Schaff
Daniel Chapman
...
I. Chetty
B. Movsas
W. Wen
Tarik K. Alafif
X. Zong
MedIm
23
4
0
13 Jun 2022
A Complementary Joint Training Approach Using Unpaired Speech and Text for Low-Resource Automatic Speech Recognition
Ye Du
Jie Zhang
Qiu-shi Zhu
Lirong Dai
Ming Wu
Xin Fang
Zhouwang Yang
34
2
0
05 Apr 2022
Dynamic Latency for CTC-Based Streaming Automatic Speech Recognition With Emformer
J. Sun
Guiping Zhong
Dinghao Zhou
Baoxiang Li
21
0
0
29 Mar 2022
Adversarial Attacks on Speech Recognition Systems for Mission-Critical Applications: A Survey
Ngoc Dung Huynh
Mohamed Reda Bouadjenek
Imran Razzak
Kevin Lee
Chetan Arora
Ali Hassani
A. Zaslavsky
AAML
34
6
0
22 Feb 2022
Polyphonic pitch detection with convolutional recurrent neural networks
Carl Thomé
Sven Ahlback
25
8
0
04 Feb 2022
A Unified Speaker Adaptation Approach for ASR
Yingzhu Zhao
Chongjia Ni
C. Leung
Chenyu You
Chng Eng Siong
B. Ma
CLL
92
9
0
16 Oct 2021
Homogeneous Architecture Augmentation for Neural Predictor
Yuqiao Liu
Yehui Tang
Yizhou Sun
34
22
0
28 Jul 2021
Efficient Weight factorization for Multilingual Speech Recognition
Ngoc-Quan Pham
Tuan-Nam Nguyen
S. Stueker
A. Waibel
43
19
0
07 May 2021
A Whole Brain Probabilistic Generative Model: Toward Realizing Cognitive Architectures for Developmental Robots
T. Taniguchi
Hiroshi Yamakawa
Takayuki Nagai
Kenji Doya
M. Sakagami
Masahiro Suzuki
Tomoaki Nakamura
Akira Taniguchi
28
23
0
15 Mar 2021
End-to-End Neural Systems for Automatic Children Speech Recognition: An Empirical Study
Prashanth Gurunath Shivakumar
Shrikanth Narayanan
22
48
0
19 Feb 2021
A Survey on Deep Reinforcement Learning for Audio-Based Applications
S. Latif
Heriberto Cuayáhuitl
Farrukh Pervez
Fahad Shamshad
Hafiz Shehbaz Ali
Min Zhang
OffRL
60
73
0
01 Jan 2021
Phone Features Improve Speech Translation
Elizabeth Salesky
A. Black
30
27
0
27 May 2020
ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context
Wei Han
Zhengdong Zhang
Yu Zhang
Jiahui Yu
Chung-Cheng Chiu
James Qin
Anmol Gulati
Ruoming Pang
Yonghui Wu
42
259
0
07 May 2020
Multiresolution and Multimodal Speech Recognition with Transformers
Georgios Paraskevopoulos
Srinivas Parthasarathy
Aparna Khare
Shiva Sundaram
25
29
0
29 Apr 2020
Imputer: Sequence Modelling via Imputation and Dynamic Programming
William Chan
Chitwan Saharia
Geoffrey E. Hinton
Mohammad Norouzi
Navdeep Jaitly
BDL
AI4TS
21
114
0
20 Feb 2020
Deep Representations for Cross-spectral Ocular Biometrics
L. A. Zanlorensi
D. Lucio
A. Britto
Hugo Manuel Proença
David Menotti
CVBM
24
25
0
21 Nov 2019
A comparison of end-to-end models for long-form speech recognition
Chung-Cheng Chiu
Wei Han
Yu Zhang
Ruoming Pang
S. Kishchenko
...
Anjuli Kannan
Rohit Prabhavalkar
Zhehuai Chen
Tara N. Sainath
Yonghui Wu
AuLLM
36
82
0
06 Nov 2019
Correction of Automatic Speech Recognition with Transformer Sequence-to-sequence Model
Oleksii Hrinchuk
Mariya Popova
Boris Ginsburg
VLM
20
87
0
23 Oct 2019
Transformer-based Acoustic Modeling for Hybrid Speech Recognition
Yongqiang Wang
Abdel-rahman Mohamed
Duc Le
Chunxi Liu
Alex Xiao
...
Xiaohui Zhang
Frank Zhang
Christian Fuegen
Geoffrey Zweig
M. Seltzer
16
248
0
22 Oct 2019
Espresso: A Fast End-to-end Neural Speech Recognition Toolkit
Yiming Wang
Tongfei Chen
Hainan Xu
Shuoyang Ding
Hang Lv
Yiwen Shao
Nanyun Peng
Lei Xie
Shinji Watanabe
Sanjeev Khudanpur
VLM
33
73
0
18 Sep 2019
Focal Loss based Residual Convolutional Neural Network for Speech Emotion Recognition
Suraj Tripathi
Abhay Kumar
A. Ramesh
Chirag Singh
Promod Yenigalla
17
13
0
11 Jun 2019
Acoustic-to-Word Models with Conversational Context Information
Suyoun Kim
Florian Metze
22
7
0
21 May 2019
Deep Learning for Audio Signal Processing
Hendrik Purwins
Bo-wen Li
Tuomas Virtanen
Jan Schlüter
Shuo-yiin Chang
Tara N. Sainath
VLM
26
587
0
30 Apr 2019
Very Deep Self-Attention Networks for End-to-End Speech Recognition
Ngoc-Quan Pham
T. Nguyen
Jan Niehues
Markus Müller
Sebastian Stüker
A. Waibel
28
161
0
30 Apr 2019
From Semi-supervised to Almost-unsupervised Speech Recognition with Very-low Resource by Jointly Learning Phonetic Structures from Audio and Text Embeddings
Yi-Chen Chen
Sung-Feng Huang
Hung-yi Lee
Lin-Shan Lee
SSL
19
0
0
10 Apr 2019
Parrotron: An End-to-End Speech-to-Speech Conversion Model and its Applications to Hearing-Impaired Speech and Speech Separation
Fadi Biadsy
Ron J. Weiss
Pedro J. Moreno
D. Kanevsky
Ye Jia
27
113
0
08 Apr 2019
Mean Field Analysis of Deep Neural Networks
Justin A. Sirignano
K. Spiliopoulos
22
81
0
11 Mar 2019
A spelling correction model for end-to-end speech recognition
Jinxi Guo
Tara N. Sainath
Ron J. Weiss
AuLLM
KELM
32
139
0
19 Feb 2019
Multi-encoder multi-resolution framework for end-to-end speech recognition
Ruizhi Li
Xiaofei Wang
Sri Harish Reddy Mallidi
Takaaki Hori
Shinji Watanabe
H. Hermansky
22
13
0
12 Nov 2018
In-the-wild Facial Expression Recognition in Extreme Poses
Fei Yang
Qian Zhang
S. Mukund
Guoping Qiu
CVBM
33
4
0
06 Nov 2018
Temporal Convolutional Memory Networks for Remaining Useful Life Estimation of Industrial Machinery
Lahiru Jayasinghe
Tharaka Samarasinghe
Chau Yuen
Jenny Chen Ni Low
S. Ge
19
64
0
12 Oct 2018
An Exploration of Mimic Architectures for Residual Network Based Spectral Mapping
Peter William VanHarn Plantinga
Deblin Bagchi
Eric Fosler-Lussier
75
10
0
25 Sep 2018
Identifying the sentiment styles of YouTube's vloggers
Bennett Kleinberg
Maximilian Mozes
Isabelle van der Vegt
19
15
0
29 Aug 2018
Dialog-context aware end-to-end speech recognition
Suyoun Kim
Florian Metze
24
47
0
07 Aug 2018
Fast ASR-free and almost zero-resource keyword spotting using DTW and CNNs for humanitarian monitoring
Raghav Menon
Herman Kamper
John Quinn
T. Niesler
24
28
0
25 Jun 2018
Extending Recurrent Neural Aligner for Streaming End-to-End Speech Recognition in Mandarin
Linhao Dong
Shiyu Zhou
Wei Chen
Bo Xu
24
22
0
17 Jun 2018
Adversarial adaptive 1-D convolutional neural networks for bearing fault diagnosis under varying working condition
Bo Zhang
Wei Li
Jie Hao
Xiao-Li Li
Meng Zhang
24
53
0
01 May 2018
Graph2Seq: Graph to Sequence Learning with Attention-based Neural Networks
Kun Xu
Lingfei Wu
Zhiguo Wang
Yansong Feng
Michael Witbrock
V. Sheinin
GNN
25
172
0
03 Apr 2018
ESPnet: End-to-End Speech Processing Toolkit
Shinji Watanabe
Takaaki Hori
Shigeki Karita
Tomoki Hayashi
Jiro Nishitoba
...
Jahn Heymann
Sanjeev Khudanpur
Nanxin Chen
Adithya Renduchintala
Tsubasa Ochiai
VLM
46
1,481
0
30 Mar 2018
Self-Attentional Acoustic Models
Matthias Sperber
Jan Niehues
Graham Neubig
Sebastian Stüker
A. Waibel
22
151
0
26 Mar 2018
Multi-Dialect Speech Recognition With A Single Sequence-To-Sequence Model
Bo-wen Li
Tara N. Sainath
K. Sim
M. Bacchiani
Eugene Weinstein
Patrick Nguyen
Zhehuai Chen
Yan-Qing Wu
Kanishka Rao
26
133
0
05 Dec 2017
Spatiotemporal Modeling for Crowd Counting in Videos
Feng Xiong
Xingjian Shi
Dit-Yan Yeung
25
184
0
25 Jul 2017
Online and Linear-Time Attention by Enforcing Monotonic Alignments
Colin Raffel
Minh-Thang Luong
Peter J. Liu
Ron J. Weiss
Douglas Eck
35
255
0
03 Apr 2017
Sequence-to-Sequence Models Can Directly Translate Foreign Speech
Ron J. Weiss
J. Chorowski
Navdeep Jaitly
Yonghui Wu
Zhehuai Chen
33
341
0
24 Mar 2017