2005.08100
Cited By
Conformer: Convolution-augmented Transformer for Speech Recognition
16 May 2020
Anmol Gulati
James Qin
Chung-Cheng Chiu
Niki Parmar
Yu Zhang
Jiahui Yu
Wei Han
Shibo Wang
Zhengdong Zhang
Yonghui Wu
Ruoming Pang
Papers citing
"Conformer: Convolution-augmented Transformer for Speech Recognition"
50 / 1,744 papers shown
Title
Study of positional encoding approaches for Audio Spectrogram Transformers
L. Pepino
Pablo Riera
Luciana Ferrer
ViT
12
6
0
13 Oct 2021
On Language Model Integration for RNN Transducer based Speech Recognition
Wei Zhou
Zuoyun Zheng
Ralf Schlüter
Hermann Ney
24
22
0
13 Oct 2021
A Melody-Unsupervision Model for Singing Voice Synthesis
Soonbeom Choi
Juhan Nam
21
13
0
13 Oct 2021
All-neural beamformer for continuous speech separation
Zhuohuang Zhang
Takuya Yoshioka
Naoyuki Kanda
Zhuo Chen
Xiaofei Wang
Dongmei Wang
Sefik Emre Eskimez
25
15
0
13 Oct 2021
Speech Summarization using Restricted Self-Attention
Roshan S. Sharma
Shruti Palaskar
A. Black
Florian Metze
17
33
0
12 Oct 2021
Multi-Modal Pre-Training for Automated Speech Recognition
David M. Chan
Shalini Ghosh
D. Chakrabarty
Björn Hoffmeister
SSL
22
16
0
12 Oct 2021
VarArray: Array-Geometry-Agnostic Continuous Speech Separation
Takuya Yoshioka
Xiaofei Wang
Dongmei Wang
M. Tang
Zirun Zhu
Zhuo Chen
Naoyuki Kanda
13
37
0
12 Oct 2021
LightSeq2: Accelerated Training for Transformer-based Models on GPUs
Xiaohui Wang
Yang Wei
Ying Xiong
Guyue Huang
Xian Qian
Yufei Ding
Mingxuan Wang
Lei Li
VLM
6
29
0
12 Oct 2021
Partial Variable Training for Efficient On-Device Federated Learning
Tien-Ju Yang
Dhruv Guliani
F. Beaufays
Giovanni Motta
FedML
11
25
0
11 Oct 2021
SRU++: Pioneering Fast Recurrence with Attention for Speech Recognition
Jing Pan
Tao Lei
Kwangyoun Kim
Kyu Jeong Han
Shinji Watanabe
VLM
26
9
0
11 Oct 2021
Interactive Feature Fusion for End-to-End Noise-Robust Speech Recognition
Yuchen Hu
Nana Hou
Chen Chen
Chng Eng Siong
11
39
0
11 Oct 2021
A Comparative Study on Non-Autoregressive Modelings for Speech-to-Text Generation
Yosuke Higuchi
Nanxin Chen
Yuya Fujita
H. Inaguma
Tatsuya Komatsu
Jaesong Lee
Jumon Nozaki
Tianzi Wang
Shinji Watanabe
22
41
0
11 Oct 2021
Advancing Momentum Pseudo-Labeling with Conformer and Initialization Strategy
Yosuke Higuchi
Niko Moritz
Jonathan Le Roux
Takaaki Hori
11
11
0
11 Oct 2021
Have best of both worlds: two-pass hybrid and E2E cascading framework for speech recognition
Guoli Ye
V. Mazalov
Jinyu Li
Y. Gong
12
9
0
10 Oct 2021
Universal Paralinguistic Speech Representations Using Self-Supervised Conformers
Joel Shor
A. Jansen
Wei Han
Daniel S. Park
Yu Zhang
SSL
AI4TS
33
54
0
09 Oct 2021
An Exploration of Self-Supervised Pretrained Representations for End-to-End Speech Recognition
Xuankai Chang
Takashi Maekaku
Pengcheng Guo
Jing Shi
Yen-Ju Lu
...
Tianzi Wang
Shu-Wen Yang
Yu Tsao
Hung-yi Lee
Shinji Watanabe
SSL
AI4TS
16
81
0
09 Oct 2021
Data Augmentation with Locally-time Reversed Speech for Automatic Speech Recognition
Si-Ioi Ng
Tan Lee
17
2
0
09 Oct 2021
TitaNet: Neural Model for speaker representation with 1D Depth-wise separable convolutions and global context
Nithin Rao Koluguri
Taejin Park
Boris Ginsburg
ViT
20
92
0
08 Oct 2021
Hybrid Random Features
K. Choromanski
Haoxian Chen
Han Lin
Yuanzhe Ma
Arijit Sehanobish
...
Andy Zeng
Valerii Likhosherstov
Dmitry Kalashnikov
Vikas Sindhwani
Adrian Weller
12
21
0
08 Oct 2021
Exploring Heterogeneous Characteristics of Layers in ASR Models for More Efficient Training
Lillian Zhou
Dhruv Guliani
Andreas Kabel
Giovanni Motta
F. Beaufays
10
1
0
08 Oct 2021
Hierarchical Conditional End-to-End ASR with CTC and Multi-Granular Subword Units
Yosuke Higuchi
Keita Karube
Tetsuji Ogawa
Tetsunori Kobayashi
13
22
0
08 Oct 2021
Improving Pseudo-label Training For End-to-end Speech Recognition Using Gradient Mask
Shaoshi Ling
Chen Shen
Meng Cai
Zejun Ma
VLM
SSL
22
8
0
08 Oct 2021
Input Length Matters: Improving RNN-T and MWER Training for Long-form Telephony Speech Recognition
Zhiyun Lu
Yanwei Pan
Thibault Doutre
Parisa Haghani
Liangliang Cao
Rohit Prabhavalkar
C. Zhang
Trevor Strohman
AuLLM
72
14
0
08 Oct 2021
Streaming Transformer Transducer Based Speech Recognition Using Non-Causal Convolution
Yangyang Shi
Chunyang Wu
Dilin Wang
Alex Xiao
Jay Mahadeokar
...
Ke Li
Yuan Shangguan
Varun K. Nagaraja
Ozlem Kalinli
M. Seltzer
20
15
0
07 Oct 2021
Predictive Maintenance for General Aviation Using Convolutional Transformers
Hong Yang
Aidan P. LaBella
Travis J. Desell
AI4TS
23
5
0
07 Oct 2021
Enabling On-Device Training of Speech Recognition Models with Federated Dropout
Dhruv Guliani
Lillian Zhou
Changwan Ryu
Tien-Ju Yang
Harry Zhang
Yong Xiao
F. Beaufays
Giovanni Motta
FedML
17
16
0
07 Oct 2021
WenetSpeech: A 10000+ Hours Multi-domain Mandarin Corpus for Speech Recognition
Binbin Zhang
Hang Lv
Pengcheng Guo
Qijie Shao
Chao Yang
...
Hui Bu
Xiaoyu Chen
Chenchen Zeng
Di Wu
Zhendong Peng
17
217
0
07 Oct 2021
Cloning one's voice using very limited data in the wild
Dongyang Dai
Yuan-Jui Chen
Li Chen
Ming Tu
Lu Liu
Rui Xia
Qiao Tian
Yuping Wang
Yuxuan Wang
SyDa
17
9
0
07 Oct 2021
Back from the future: bidirectional CTC decoding using future information in speech recognition
Namkyu Jung
Geon-min Kim
Han-Gyu Kim
23
3
0
07 Oct 2021
Transcribe-to-Diarize: Neural Speaker Diarization for Unlimited Number of Speakers using End-to-End Speaker-Attributed ASR
Naoyuki Kanda
Xiong Xiao
Yashesh Gaur
Xiaofei Wang
Zhong Meng
Zhuo Chen
Takuya Yoshioka
19
34
0
07 Oct 2021
CTC Variations Through New WFST Topologies
A. Laptev
Somshubra Majumdar
Boris Ginsburg
27
20
0
06 Oct 2021
Spell my name: keyword boosted speech recognition
Namkyu Jung
Geon-min Kim
Joon Son Chung
38
13
0
06 Oct 2021
Language Modeling using LMUs: 10x Better Data Efficiency or Improved Scaling Compared to Transformers
Narsimha Chilkuri
Eric Hunsberger
Aaron R. Voelker
G. Malik
C. Eliasmith
30
7
0
05 Oct 2021
S2 Reducer: High-Performance Sparse Communication to Accelerate Distributed Deep Learning
Ke-shi Ge
Yongquan Fu
Zhiquan Lai
Xiaoge Deng
Dongsheng Li
15
2
0
05 Oct 2021
Sound Event Detection Transformer: An Event-based End-to-End Model for Sound Event Detection
Zhi-qin Ye
Xiangdong Wang
Hong Liu
Yueliang Qian
Ruijie Tao
Long Yan
Kazushige Ouchi
ViT
27
15
0
05 Oct 2021
ASR Rescoring and Confidence Estimation with ELECTRA
Hayato Futami
H. Inaguma
Masato Mimura
S. Sakai
Tatsuya Kawahara
KELM
54
20
0
05 Oct 2021
Fast Contextual Adaptation with Neural Associative Memory for On-Device Personalized Speech Recognition
Tsendsuren Munkhdalai
K. Sim
Angad Chandorkar
Fan Gao
Mason Chua
Trevor Strohman
F. Beaufays
29
34
0
05 Oct 2021
Towards efficient end-to-end speech recognition with biologically-inspired neural networks
Thomas Bohnstingl
Ayush Garg
Stanislaw Woźniak
G. Saon
E. Eleftheriou
A. Pantazi
16
5
0
04 Oct 2021
Speech Technology for Everyone: Automatic Speech Recognition for Non-Native English with Transfer Learning
Toshiko Shibano
Xinyi Zhang
Miao Li
Haejin Cho
Peter Sullivan
Muhammad Abdul-Mageed
VLM
36
17
0
01 Oct 2021
Large-scale ASR Domain Adaptation using Self- and Semi-supervised Learning
DongSeon Hwang
Ananya Misra
Zhouyuan Huo
Nikhil Siddhartha
Shefali Garg
David Qiu
K. Sim
Trevor Strohman
F. Beaufays
Yanzhang He
55
34
0
01 Oct 2021
Incremental Layer-wise Self-Supervised Learning for Efficient Speech Domain Adaptation On Device
Zhouyuan Huo
Dong-Gyo Hwang
K. Sim
Shefali Garg
Ananya Misra
Nikhil Siddhartha
Trevor Strohman
Françoise Beaufays
46
7
0
01 Oct 2021
SpliceOut: A Simple and Efficient Audio Augmentation Method
Arjit Jain
Pranay Reddy Samala
Deepak Mittal
P. Jyothi
M. Singh
20
10
0
30 Sep 2021
Multi Scale Graph Wavenet for Wind Speed Forecasting
Neetesh Rathore
Pradeep Rathore
Arghya Basak
S. Nistala
Venkataramana Runkana
AI4TS
69
18
0
30 Sep 2021
FastCorrect 2: Fast Error Correction on Multiple Candidates for Automatic Speech Recognition
Yichong Leng
Xu Tan
Rui Wang
Linchen Zhu
Jin Xu
...
Linquan Liu
Tao Qin
Xiang-Yang Li
Ed Lin
Tie-Yan Liu
27
40
0
29 Sep 2021
BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition
Yu Zhang
Daniel S. Park
Wei Han
James Qin
Anmol Gulati
...
Zhifeng Chen
Quoc V. Le
Chung-Cheng Chiu
Ruoming Pang
Yonghui Wu
SSL
19
175
0
27 Sep 2021
Fast-MD: Fast Multi-Decoder End-to-End Speech Translation with Non-Autoregressive Hidden Intermediates
H. Inaguma
Siddharth Dalmia
Brian Yan
Shinji Watanabe
57
11
0
27 Sep 2021
ChannelAugment: Improving generalization of multi-channel ASR by training with input channel randomization
M. Gaudesi
F. Weninger
D. Sharma
P. Zhan
AAML
19
1
0
23 Sep 2021
Audiomer: A Convolutional Transformer For Keyword Spotting
Surya Kant Sahu
Sai Mitheran
Juhi Kamdar
Meet Gandhi
32
8
0
21 Sep 2021
Audio-Visual Speech Recognition is Worth 32×32×8 Voxels
Dmitriy Serdyuk
Otavio Braga
Olivier Siohan
ViT
21
7
0
20 Sep 2021
Influence of ASR and Language Model on Alzheimer's Disease Detection
Joan Codina-Filbà
Guillermo Cámbara
Jordi Luque
Mireia Farrús
11
2
0
20 Sep 2021