2005.08100
Cited By
Conformer: Convolution-augmented Transformer for Speech Recognition
16 May 2020
Anmol Gulati
James Qin
Chung-Cheng Chiu
Niki Parmar
Yu Zhang
Jiahui Yu
Wei Han
Shibo Wang
Zhengdong Zhang
Yonghui Wu
Ruoming Pang
Papers citing
"Conformer: Convolution-augmented Transformer for Speech Recognition"
50 / 1,744 papers shown
Title
Layer Pruning on Demand with Intermediate CTC
Jaesong Lee
Jingu Kang
Shinji Watanabe
11
16
0
17 Jun 2021
LiRA: Learning Visual Speech Representations from Audio through Self-supervision
Pingchuan Ma
Rodrigo Mira
Stavros Petridis
Björn W. Schuller
M. Pantic
SSL
16
53
0
16 Jun 2021
Collaborative Training of Acoustic Encoders for Speech Recognition
Varun K. Nagaraja
Yangyang Shi
Ganesh Venkatesh
Ozlem Kalinli
M. Seltzer
Vikas Chandra
24
11
0
16 Jun 2021
Multi-Speaker ASR Combining Non-Autoregressive Conformer CTC and Conditional Speaker Chain
Pengcheng Guo
Xuankai Chang
Shinji Watanabe
Lei Xie
11
18
0
16 Jun 2021
RyanSpeech: A Corpus for Conversational Text-to-Speech Synthesis
Rohola Zandie
Mohammad H. Mahoor
Julia Madsen
Eshrat S. Emamian
13
24
0
15 Jun 2021
Learning Audio-Visual Dereverberation
Changan Chen
Wei-Ju Sun
David F. Harwath
Kristen Grauman
23
31
0
14 Jun 2021
Overcoming Domain Mismatch in Low Resource Sequence-to-Sequence ASR Models using Hybrid Generated Pseudotranscripts
Chak-Fai Li
Francis Keith
William Hartmann
M. Snover
O. Kimball
9
4
0
14 Jun 2021
HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units
Wei-Ning Hsu
Benjamin Bolte
Yao-Hung Hubert Tsai
Kushal Lakhotia
Ruslan Salakhutdinov
Abdel-rahman Mohamed
SSL
20
2,762
0
14 Jun 2021
End-to-end Neural Diarization: From Transformer to Conformer
Yi Y. Liu
Eunjung Han
Chul Lee
A. Stolcke
11
40
0
14 Jun 2021
GigaSpeech: An Evolving, Multi-domain ASR Corpus with 10,000 Hours of Transcribed Audio
Guoguo Chen
Shuzhou Chai
Guan-Bo Wang
Jiayu Du
Weiqiang Zhang
...
Xuchen Yao
Yongqing Wang
Yujun Wang
Zhao You
Zhiyong Yan
17
348
0
13 Jun 2021
PARP: Prune, Adjust and Re-Prune for Self-Supervised Speech Recognition
Cheng-I Jeff Lai
Yang Zhang
Alexander H. Liu
Shiyu Chang
Yi-Lun Liao
Yung-Sung Chuang
Kaizhi Qian
Sameer Khurana
David D. Cox
James R. Glass
VLM
49
70
0
10 Jun 2021
Balanced End-to-End Monolingual pre-training for Low-Resourced Indic Languages Code-Switching Speech Recognition
A. Hussein
Shammur A. Chowdhury
Najim Dehak
Ahmed M. Ali
11
2
0
10 Jun 2021
GroupBERT: Enhanced Transformer Architecture with Efficient Grouped Structures
Ivan Chelombiev
Daniel Justus
Douglas Orr
A. Dietrich
Frithjof Gressmann
A. Koliousis
Carlo Luschi
11
5
0
10 Jun 2021
U2++: Unified Two-pass Bidirectional End-to-end Model for Speech Recognition
Di Wu
Binbin Zhang
Chao Yang
Zhendong Peng
Wenjing Xia
Xiaoyu Chen
X. Lei
13
47
0
10 Jun 2021
Audiovisual transfer learning for audio tagging and sound event detection
Wim Boes
Hugo Van hamme
CLIP
VLM
11
11
0
09 Jun 2021
Do Transformers Really Perform Bad for Graph Representation?
Chengxuan Ying
Tianle Cai
Shengjie Luo
Shuxin Zheng
Guolin Ke
Di He
Yanming Shen
Tie-Yan Liu
GNN
23
432
0
09 Jun 2021
A Comparative Study on Neural Architectures and Training Methods for Japanese Speech Recognition
Shigeki Karita
Yotaro Kubo
M. Bacchiani
Llion Jones
17
13
0
09 Jun 2021
Unsupervised Automatic Speech Recognition: A Review
Hanan Aldarmaki
Asad Ullah
Nazar Zaki
VLM
SSL
31
56
0
09 Jun 2021
A Survey of Transformers
Tianyang Lin
Yuxin Wang
Xiangyang Liu
Xipeng Qiu
ViT
27
1,084
0
08 Jun 2021
Raw Waveform Encoder with Multi-Scale Globally Attentive Locally Recurrent Networks for End-to-End Speech Recognition
Max W. Y. Lam
Jun Wang
Chao Weng
Dan Su
Dong Yu
21
6
0
08 Jun 2021
LocalTrans: A Multiscale Local Transformer Network for Cross-Resolution Homography Estimation
Ruizhi Shao
Gaochang Wu
Yuemei Zhou
Ying Fu
Yebin Liu
ViT
8
42
0
08 Jun 2021
Data Augmentation Methods for End-to-end Speech Recognition on Distant-Talk Scenarios
E. Tsunoo
Kentarou Shibata
Chaitanya Narisetty
Yosuke Kashiwagi
Shinji Watanabe
11
12
0
07 Jun 2021
CAPE: Encoding Relative Positions with Continuous Augmented Positional Embeddings
Tatiana Likhomanenko
Qiantong Xu
Gabriel Synnaeve
R. Collobert
A. Rogozhnikov
OOD
ViT
23
54
0
06 Jun 2021
Attention mechanisms and deep learning for machine vision: A survey of the state of the art
A. M. Hafiz
S. A. Parah
R. A. Bhat
11
45
0
03 Jun 2021
Dual Script E2E framework for Multilingual and Code-Switching ASR
Mari Ganesh Kumar
Jom Kuriakose
Anand Thyagachandran
A. Arunkumar
Ashish Seth
L. D. Prasad
Saish Jaiswal
Anusha Prakash
H. Murthy
27
10
0
02 Jun 2021
Towards One Model to Rule All: Multilingual Strategy for Dialectal Code-Switching Arabic ASR
Shammur A. Chowdhury
A. Hussein
Ahmed Abdelali
Ahmed M. Ali
14
33
0
31 May 2021
Cross-Referencing Self-Training Network for Sound Event Detection in Audio Mixtures
Sangwook Park
D. Han
Mounya Elhilali
14
12
0
27 May 2021
Unsupervised Speech Recognition
Alexei Baevski
Wei-Ning Hsu
Alexis Conneau
Michael Auli
SSL
12
270
0
24 May 2021
Attention-based Neural Beamforming Layers for Multi-channel Speech Recognition
Bhargav Pulugundla
Yang Gao
Brian King
Gokce Keskin
Sri Harish Reddy Mallidi
Minhua Wu
J. Droppo
Roland Maas
11
2
0
12 May 2021
Stacked Acoustic-and-Textual Encoding: Integrating the Pre-trained Models into Speech Translation Encoders
Chen Xu
Bojie Hu
Yanyang Li
Yuhao Zhang
Shen Huang
Qi Ju
Tong Xiao
Jingbo Zhu
17
75
0
12 May 2021
GSPMD: General and Scalable Parallelization for ML Computation Graphs
Yuanzhong Xu
HyoukJoong Lee
Dehao Chen
Blake A. Hechtman
Yanping Huang
...
Noam M. Shazeer
Shibo Wang
Tao Wang
Yonghui Wu
Zhifeng Chen
MoE
20
127
0
10 May 2021
FastCorrect: Fast Error Correction with Edit Alignment for Automatic Speech Recognition
Yichong Leng
Xu Tan
Linchen Zhu
Jin Xu
Renqian Luo
Linquan Liu
Tao Qin
Xiang-Yang Li
Ed Lin
Tie-Yan Liu
KELM
22
63
0
09 May 2021
SpeechNet: A Universal Modularized Model for Speech Processing Tasks
Yi-Chen Chen
Po-Han Chi
Shu-Wen Yang
Kai-Wei Chang
Jheng-hao Lin
Sung-Feng Huang
Da-Rong Liu
Chi-Liang Liu
Cheng-Kuang Lee
Hung-yi Lee
MoE
21
17
0
07 May 2021
SpeechMoE: Scaling to Large Acoustic Models with Dynamic Routing Mixture of Experts
Zhao You
Shulin Feng
Dan Su
Dong Yu
MoE
13
11
0
07 May 2021
Efficient Weight factorization for Multilingual Speech Recognition
Ngoc-Quan Pham
Tuan-Nam Nguyen
S. Stueker
A. Waibel
35
19
0
07 May 2021
On the limit of English conversational speech recognition
Zoltán Tüske
G. Saon
Brian Kingsbury
9
50
0
03 May 2021
Scaling End-to-End Models for Large-Scale Multilingual ASR
Bo-wen Li
Ruoming Pang
Tara N. Sainath
Anmol Gulati
Yu Zhang
James Qin
Parisa Haghani
W. R. Huang
Min Ma
Junwen Bai
CLL
24
76
0
30 Apr 2021
Bridging the gap between streaming and non-streaming ASR systems by distilling ensembles of CTC and RNN-T models
Thibault Doutre
Wei Han
Chung-Cheng Chiu
Ruoming Pang
Olivier Siohan
Liangliang Cao
16
5
0
25 Apr 2021
Advanced Long-context End-to-end Speech Recognition Using Context-expanded Transformers
Takaaki Hori
Niko Moritz
Chiori Hori
Jonathan Le Roux
14
34
0
19 Apr 2021
Acoustic Data-Driven Subword Modeling for End-to-End Speech Recognition
Wei Zhou
Mohammad Zeineldeen
Zuoyun Zheng
Ralf Schlüter
Hermann Ney
17
14
0
19 Apr 2021
A novel time-frequency Transformer based on self-attention mechanism and its application in fault diagnosis of rolling bearings
Yifei Ding
M. Jia
Qiuhua Miao
Yudong Cao
14
266
0
19 Apr 2021
Efficient conformer-based speech recognition with linear attention
Shengqiang Li
Menglong Xu
Xiao-Lei Zhang
13
18
0
14 Apr 2021
Non-autoregressive sequence-to-sequence voice conversion
Tomoki Hayashi
Wen-Chin Huang
Kazuhiro Kobayashi
T. Toda
6
23
0
14 Apr 2021
Source and Target Bidirectional Knowledge Distillation for End-to-end Speech Translation
H. Inaguma
Tatsuya Kawahara
Shinji Watanabe
21
42
0
13 Apr 2021
Lessons on Parameter Sharing across Layers in Transformers
Sho Takase
Shun Kiyono
11
84
0
13 Apr 2021
Improved Conformer-based End-to-End Speech Recognition Using Neural Architecture Search
Yukun Liu
Ta Li
Pengyuan Zhang
Yonghong Yan
AI4TS
11
6
0
12 Apr 2021
A Toolbox for Construction and Analysis of Speech Datasets
Evelina Bakhturina
Vitaly Lavrukhin
Boris Ginsburg
8
12
0
11 Apr 2021
Boundary and Context Aware Training for CIF-based Non-Autoregressive End-to-end ASR
Fan Yu
Haoneng Luo
Pengcheng Guo
Yuhao Liang
Zhuoyuan Yao
Lei Xie
Yingying Gao
Leijing Hou
Shilei Zhang
11
11
0
10 Apr 2021
Lookup-Table Recurrent Language Models for Long Tail Speech Recognition
W. R. Huang
Tara N. Sainath
Cal Peyser
Shankar Kumar
David Rybach
Trevor Strohman
RALM
LMTD
14
5
0
09 Apr 2021
On Architectures and Training for Raw Waveform Feature Extraction in ASR
Peter Vieting
Christoph Lüscher
Wilfried Michel
Ralf Schlüter
Hermann Ney
19
9
0
09 Apr 2021