Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2005.08100
Cited By
Conformer: Convolution-augmented Transformer for Speech Recognition
16 May 2020
Anmol Gulati
James Qin
Chung-Cheng Chiu
Niki Parmar
Yu Zhang
Jiahui Yu
Wei Han
Shibo Wang
Zhengdong Zhang
Yonghui Wu
Ruoming Pang
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Conformer: Convolution-augmented Transformer for Speech Recognition"
50 / 1,744 papers shown
Title
Wav-BERT: Cooperative Acoustic and Linguistic Representation Learning for Low-Resource Speech Recognition
Guolin Zheng
Yubei Xiao
Ke Gong
Pan Zhou
Xiaodan Liang
Liang Lin
16
26
0
19 Sep 2021
Dual-Encoder Architecture with Encoder Selection for Joint Close-Talk and Far-Talk Speech Recognition
F. Weninger
M. Gaudesi
Ralf Leibold
R. Gemello
P. Zhan
18
4
0
17 Sep 2021
Primer: Searching for Efficient Transformers for Language Modeling
David R. So
Wojciech Mañke
Hanxiao Liu
Zihang Dai
Noam M. Shazeer
Quoc V. Le
VLM
83
152
0
17 Sep 2021
PDAugment: Data Augmentation by Pitch and Duration Adjustments for Automatic Lyrics Transcription
Chen Zhang
Jiaxing Yu
Luchin Chang
Xu Tan
Jiawei Chen
Tao Qin
Kecheng Zhang
18
15
0
16 Sep 2021
Tied & Reduced RNN-T Decoder
Rami Botros
Tara N. Sainath
R. David
Emmanuel Guzman
Wei Li
Yanzhang He
17
55
0
15 Sep 2021
Performance-Efficiency Trade-offs in Unsupervised Pre-training for Speech Recognition
Felix Wu
Kwangyoun Kim
Jing Pan
Kyu Jeong Han
Kilian Q. Weinberger
Yoav Artzi
25
71
0
14 Sep 2021
Non-autoregressive End-to-end Speech Translation with Parallel Autoregressive Rescoring
H. Inaguma
Yosuke Higuchi
Kevin Duh
Tatsuya Kawahara
Shinji Watanabe
43
11
0
09 Sep 2021
Beijing ZKJ-NPU Speaker Verification System for VoxCeleb Speaker Recognition Challenge 2021
Li Lyna Zhang
Huan Zhao
Qinling Meng
Yanli Chen
Min Liu
Lei Xie
22
10
0
08 Sep 2021
A Survey of Sound Source Localization with Deep Learning Methods
Pierre-Amaury Grumiaux
Srdjan Kitić
Laurent Girin
Alexandre Guérin
17
246
0
08 Sep 2021
Efficient conformer: Progressive downsampling and grouped attention for automatic speech recognition
Maxime Burchi
Valentin Vielzeuf
21
83
0
31 Aug 2021
Multi-Channel Transformer Transducer for Speech Recognition
Feng-Ju Chang
Martin H. Radfar
Athanasios Mouchtaris
M. Omologo
16
19
0
30 Aug 2021
Injecting Text in Self-Supervised Speech Pretraining
Zhehuai Chen
Yu Zhang
Andrew Rosenberg
Bhuvana Ramabhadran
Gary Wang
Pedro J. Moreno
SSL
17
36
0
27 Aug 2021
Self-Attention for Audio Super-Resolution
Nathanaël Carraz Rakotonirina
SupR
20
23
0
26 Aug 2021
Multilingual Speech Recognition for Low-Resource Indian Languages using Multi-Task conformer
Krishna D N Freshworks
19
7
0
22 Aug 2021
A Dual-Decoder Conformer for Multilingual Speech Recognition
Krishna D N Freshworks
4
1
0
22 Aug 2021
Generalizing RNN-Transducer to Out-Domain Audio via Sparse Self-Attention Layers
Juntae Kim
Jee-Hye Lee
16
6
0
22 Aug 2021
Towards Efficient Point Cloud Graph Neural Networks Through Architectural Simplification
Shyam A. Tailor
R. D. Jong
Tiago Azevedo
Matthew Mattina
Partha P. Maji
3DPC
GNN
11
12
0
13 Aug 2021
Masked Acoustic Unit for Mispronunciation Detection and Correction
Zhan Zhang
Yuehai Wang
Jianyi Yang
20
3
0
12 Aug 2021
W2v-BERT: Combining Contrastive Learning and Masked Language Modeling for Self-Supervised Speech Pre-Training
Yu-An Chung
Yu Zhang
Wei Han
Chung-Cheng Chiu
James Qin
Ruoming Pang
Yonghui Wu
SSL
VLM
12
410
0
07 Aug 2021
Dyn-ASR: Compact, Multilingual Speech Recognition via Spoken Language and Accent Identification
Sangeeta Ghangam
Daniel Whitenack
Joshua Nemecek
6
4
0
04 Aug 2021
Decoupling recognition and transcription in Mandarin ASR
Jiahong Yuan
Xingyu Cai
Dongji Gao
Renjie Zheng
Liang Huang
Kenneth Ward Church
28
9
0
02 Aug 2021
USC: An Open-Source Uzbek Speech Corpus and Initial Speech Recognition Experiments
M. Musaev
Saida Mussakhojayeva
Ilyos Khujayorov
Yerbolat Khassanov
M. Ochilov
H. A. Varol
11
19
0
30 Jul 2021
Proposal-based Few-shot Sound Event Detection for Speech and Environmental Sounds with Perceivers
Piper Wolters
Logan Sizemore
Chris Daw
Brian Hutchinson
Lauren A. Phillips
27
11
0
28 Jul 2021
CarneliNet: Neural Mixture Model for Automatic Speech Recognition
A. Kalinov
Somshubra Majumdar
Jagadeesh Balam
Boris Ginsburg
MoE
15
3
0
22 Jul 2021
Multitask-Based Joint Learning Approach To Robust ASR For Radio Communication Speech
Duo Ma
Nana Hou
Van Tung Pham
Haihua Xu
Chng Eng Siong
17
22
0
22 Jul 2021
Streaming End-to-End ASR based on Blockwise Non-Autoregressive Models
Tianzi Wang
Yuya Fujita
Xuankai Chang
Shinji Watanabe
11
15
0
20 Jul 2021
Assessment of Self-Attention on Learned Features For Sound Event Localization and Detection
Parthasaarathy Sudarsanam
A. Politis
K. Drossos
9
13
0
20 Jul 2021
Translatotron 2: High-quality direct speech-to-speech translation with voice preservation
Ye Jia
Michelle Tadmor Ramanovich
Tal Remez
Roi Pomerantz
26
67
0
19 Jul 2021
VAD-free Streaming Hybrid CTC/Attention ASR for Unsegmented Recording
H. Inaguma
Tatsuya Kawahara
15
2
0
15 Jul 2021
Conformer-based End-to-end Speech Recognition With Rotary Position Embedding
Shengqiang Li
Menglong Xu
Xiao-Lei Zhang
13
9
0
13 Jul 2021
Speech Representation Learning Combining Conformer CPC with Deep Cluster for the ZeroSpeech Challenge 2021
Takashi Maekaku
Xuankai Chang
Yuya Fujita
Li-Wei Chen
Shinji Watanabe
Alexander I. Rudnicky
104
13
0
13 Jul 2021
Dropout Regularization for Self-Supervised Learning of Transformer Encoder Speech Representation
Jian Luo
Jianzong Wang
Ning Cheng
Jing Xiao
SSL
16
6
0
09 Jul 2021
On lattice-free boosted MMI training of HMM and CTC-based full-context ASR models
Xiaohui Zhang
Vimal Manohar
David C. Zhang
Frank Zhang
Yangyang Shi
Nayan Singhal
Julian Chan
Fuchun Peng
Yatharth Saraf
M. Seltzer
12
14
0
09 Jul 2021
Improved Language Identification Through Cross-Lingual Self-Supervised Learning
Andros Tjandra
Diptanu Gon Choudhury
Frank Zhang
Kritika Singh
Alexis Conneau
Alexei Baevski
Assaf Sela
Yatharth Saraf
Michael Auli
VLM
SSL
21
35
0
08 Jul 2021
Advancing CTC-CRF Based End-to-End Speech Recognition with Wordpieces and Conformers
Huahuan Zheng
Wenjie Peng
Zhijian Ou
Jinsong Zhang
10
5
0
07 Jul 2021
GLiT: Neural Architecture Search for Global and Local Image Transformer
Boyu Chen
Peixia Li
Chuming Li
Baopu Li
Lei Bai
Chen Lin
Ming-hui Sun
Junjie Yan
Wanli Ouyang
ViT
24
85
0
07 Jul 2021
A Comparative Study of Modular and Joint Approaches for Speaker-Attributed ASR on Monaural Long-Form Audio
Naoyuki Kanda
Xiong Xiao
Jian Wu
Tianyan Zhou
Yashesh Gaur
Xiaofei Wang
Zhong Meng
Zhuo Chen
Takuya Yoshioka
6
13
0
06 Jul 2021
The NiuTrans End-to-End Speech Translation System for IWSLT 2021 Offline Task
Chen Xu
Xiaoqian Liu
Xiaowen Liu
Laohu Wang
Canan Huang
Tong Xiao
Jingbo Zhu
23
5
0
06 Jul 2021
Investigation of Practical Aspects of Single Channel Speech Separation for ASR
Jian Wu
Zhuo Chen
Sanyuan Chen
Yu-Huan Wu
Takuya Yoshioka
Naoyuki Kanda
Shujie Liu
Jinyu Li
14
17
0
05 Jul 2021
Relaxed Attention: A Simple Method to Boost Performance of End-to-End Automatic Speech Recognition
Timo Lohrenz
P. Schwarz
Zhengyang Li
Tim Fingscheidt
13
11
0
02 Jul 2021
Dual Causal/Non-Causal Self-Attention for Streaming End-to-End Speech Recognition
Niko Moritz
Takaaki Hori
Jonathan Le Roux
6
20
0
02 Jul 2021
ESPnet-ST IWSLT 2021 Offline Speech Translation System
H. Inaguma
Shun Kiyono
Nelson Enrique Yalta Soplin
Pengcheng Guo
Jun Suzuki
Kevin Duh
Shinji Watanabe
3DV
32
2
0
01 Jul 2021
StableEmit: Selection Probability Discount for Reducing Emission Latency of Streaming Monotonic Attention ASR
H. Inaguma
Tatsuya Kawahara
17
4
0
01 Jul 2021
IMS' Systems for the IWSLT 2021 Low-Resource Speech Translation Task
Pavel Denisov
Manuel Mager
Ngoc Thang Vu
27
6
0
30 Jun 2021
DF-Conformer: Integrated architecture of Conv-TasNet and Conformer using linear complexity self-attention for speech enhancement
Yuma Koizumi
Shigeki Karita
Scott Wisdom
Hakan Erdogan
J. Hershey
Llion Jones
M. Bacchiani
19
41
0
30 Jun 2021
N-Singer: A Non-Autoregressive Korean Singing Voice Synthesis System for Pronunciation Enhancement
Gyeong-Hoon Lee
Tae-Woo Kim
Hanbin Bae
Min-Ji Lee
Young-Ik Kim
Hoon-Young Cho
VLM
6
19
0
29 Jun 2021
Stable, Fast and Accurate: Kernelized Attention with Relative Positional Encoding
Shengjie Luo
Shanda Li
Tianle Cai
Di He
Dinglan Peng
Shuxin Zheng
Guolin Ke
Liwei Wang
Tie-Yan Liu
19
50
0
23 Jun 2021
An Improved Single Step Non-autoregressive Transformer for Automatic Speech Recognition
Ruchao Fan
Wei Chu
Peng Chang
Jing Xiao
Abeer Alwan
14
15
0
18 Jun 2021
Multi-mode Transformer Transducer with Stochastic Future Context
Kwangyoun Kim
Felix Wu
Prashant Sridhar
Kyu Jeong Han
Shinji Watanabe
30
9
0
17 Jun 2021
Efficient Conformer with Prob-Sparse Attention Mechanism for End-to-EndSpeech Recognition
Xiong Wang
Sining Sun
Lei Xie
Long Ma
16
18
0
17 Jun 2021
Previous
1
2
3
...
31
32
33
34
35
Next