Conformer: Convolution-augmented Transformer for Speech Recognition

16 May 2020
Anmol Gulati
James Qin
Chung-Cheng Chiu
Niki Parmar
Yu Zhang
Jiahui Yu
Wei Han
Shibo Wang
Zhengdong Zhang
Yonghui Wu
Ruoming Pang
arXiv: 2005.08100

Papers citing "Conformer: Convolution-augmented Transformer for Speech Recognition"

50 / 1,744 papers shown
Title
Wav-BERT: Cooperative Acoustic and Linguistic Representation Learning for Low-Resource Speech Recognition
Guolin Zheng
Yubei Xiao
Ke Gong
Pan Zhou
Xiaodan Liang
Liang Lin
16
26
0
19 Sep 2021
Dual-Encoder Architecture with Encoder Selection for Joint Close-Talk and Far-Talk Speech Recognition
F. Weninger
M. Gaudesi
Ralf Leibold
R. Gemello
P. Zhan
18
4
0
17 Sep 2021
Primer: Searching for Efficient Transformers for Language Modeling
David R. So
Wojciech Mańke
Hanxiao Liu
Zihang Dai
Noam M. Shazeer
Quoc V. Le
VLM
83
152
0
17 Sep 2021
PDAugment: Data Augmentation by Pitch and Duration Adjustments for Automatic Lyrics Transcription
Chen Zhang
Jiaxing Yu
Luchin Chang
Xu Tan
Jiawei Chen
Tao Qin
Kecheng Zhang
18
15
0
16 Sep 2021
Tied & Reduced RNN-T Decoder
Rami Botros
Tara N. Sainath
R. David
Emmanuel Guzman
Wei Li
Yanzhang He
17
55
0
15 Sep 2021
Performance-Efficiency Trade-offs in Unsupervised Pre-training for Speech Recognition
Felix Wu
Kwangyoun Kim
Jing Pan
Kyu Jeong Han
Kilian Q. Weinberger
Yoav Artzi
25
71
0
14 Sep 2021
Non-autoregressive End-to-end Speech Translation with Parallel Autoregressive Rescoring
H. Inaguma
Yosuke Higuchi
Kevin Duh
Tatsuya Kawahara
Shinji Watanabe
43
11
0
09 Sep 2021
Beijing ZKJ-NPU Speaker Verification System for VoxCeleb Speaker Recognition Challenge 2021
Li Lyna Zhang
Huan Zhao
Qinling Meng
Yanli Chen
Min Liu
Lei Xie
22
10
0
08 Sep 2021
A Survey of Sound Source Localization with Deep Learning Methods
Pierre-Amaury Grumiaux
Srdjan Kitić
Laurent Girin
Alexandre Guérin
17
246
0
08 Sep 2021
Efficient conformer: Progressive downsampling and grouped attention for automatic speech recognition
Maxime Burchi
Valentin Vielzeuf
21
83
0
31 Aug 2021
Multi-Channel Transformer Transducer for Speech Recognition
Feng-Ju Chang
Martin H. Radfar
Athanasios Mouchtaris
M. Omologo
16
19
0
30 Aug 2021
Injecting Text in Self-Supervised Speech Pretraining
Zhehuai Chen
Yu Zhang
Andrew Rosenberg
Bhuvana Ramabhadran
Gary Wang
Pedro J. Moreno
SSL
17
36
0
27 Aug 2021
Self-Attention for Audio Super-Resolution
Nathanaël Carraz Rakotonirina
SupR
20
23
0
26 Aug 2021
Multilingual Speech Recognition for Low-Resource Indian Languages using Multi-Task conformer
Krishna D N (Freshworks)
19
7
0
22 Aug 2021
A Dual-Decoder Conformer for Multilingual Speech Recognition
Krishna D N (Freshworks)
4
1
0
22 Aug 2021
Generalizing RNN-Transducer to Out-Domain Audio via Sparse Self-Attention Layers
Juntae Kim
Jee-Hye Lee
16
6
0
22 Aug 2021
Towards Efficient Point Cloud Graph Neural Networks Through Architectural Simplification
Shyam A. Tailor
R. D. Jong
Tiago Azevedo
Matthew Mattina
Partha P. Maji
3DPC
GNN
11
12
0
13 Aug 2021
Masked Acoustic Unit for Mispronunciation Detection and Correction
Zhan Zhang
Yuehai Wang
Jianyi Yang
20
3
0
12 Aug 2021
W2v-BERT: Combining Contrastive Learning and Masked Language Modeling for Self-Supervised Speech Pre-Training
Yu-An Chung
Yu Zhang
Wei Han
Chung-Cheng Chiu
James Qin
Ruoming Pang
Yonghui Wu
SSL
VLM
12
410
0
07 Aug 2021
Dyn-ASR: Compact, Multilingual Speech Recognition via Spoken Language and Accent Identification
Sangeeta Ghangam
Daniel Whitenack
Joshua Nemecek
6
4
0
04 Aug 2021
Decoupling recognition and transcription in Mandarin ASR
Jiahong Yuan
Xingyu Cai
Dongji Gao
Renjie Zheng
Liang Huang
Kenneth Ward Church
28
9
0
02 Aug 2021
USC: An Open-Source Uzbek Speech Corpus and Initial Speech Recognition Experiments
M. Musaev
Saida Mussakhojayeva
Ilyos Khujayorov
Yerbolat Khassanov
M. Ochilov
H. A. Varol
11
19
0
30 Jul 2021
Proposal-based Few-shot Sound Event Detection for Speech and Environmental Sounds with Perceivers
Piper Wolters
Logan Sizemore
Chris Daw
Brian Hutchinson
Lauren A. Phillips
27
11
0
28 Jul 2021
CarneliNet: Neural Mixture Model for Automatic Speech Recognition
A. Kalinov
Somshubra Majumdar
Jagadeesh Balam
Boris Ginsburg
MoE
15
3
0
22 Jul 2021
Multitask-Based Joint Learning Approach To Robust ASR For Radio Communication Speech
Duo Ma
Nana Hou
Van Tung Pham
Haihua Xu
Chng Eng Siong
17
22
0
22 Jul 2021
Streaming End-to-End ASR based on Blockwise Non-Autoregressive Models
Tianzi Wang
Yuya Fujita
Xuankai Chang
Shinji Watanabe
11
15
0
20 Jul 2021
Assessment of Self-Attention on Learned Features For Sound Event Localization and Detection
Parthasaarathy Sudarsanam
A. Politis
K. Drossos
9
13
0
20 Jul 2021
Translatotron 2: High-quality direct speech-to-speech translation with voice preservation
Ye Jia
Michelle Tadmor Ramanovich
Tal Remez
Roi Pomerantz
26
67
0
19 Jul 2021
VAD-free Streaming Hybrid CTC/Attention ASR for Unsegmented Recording
H. Inaguma
Tatsuya Kawahara
15
2
0
15 Jul 2021
Conformer-based End-to-end Speech Recognition With Rotary Position Embedding
Shengqiang Li
Menglong Xu
Xiao-Lei Zhang
13
9
0
13 Jul 2021
Speech Representation Learning Combining Conformer CPC with Deep Cluster for the ZeroSpeech Challenge 2021
Takashi Maekaku
Xuankai Chang
Yuya Fujita
Li-Wei Chen
Shinji Watanabe
Alexander I. Rudnicky
104
13
0
13 Jul 2021
Dropout Regularization for Self-Supervised Learning of Transformer Encoder Speech Representation
Jian Luo
Jianzong Wang
Ning Cheng
Jing Xiao
SSL
16
6
0
09 Jul 2021
On lattice-free boosted MMI training of HMM and CTC-based full-context ASR models
Xiaohui Zhang
Vimal Manohar
David C. Zhang
Frank Zhang
Yangyang Shi
Nayan Singhal
Julian Chan
Fuchun Peng
Yatharth Saraf
M. Seltzer
12
14
0
09 Jul 2021
Improved Language Identification Through Cross-Lingual Self-Supervised Learning
Andros Tjandra
Diptanu Gon Choudhury
Frank Zhang
Kritika Singh
Alexis Conneau
Alexei Baevski
Assaf Sela
Yatharth Saraf
Michael Auli
VLM
SSL
21
35
0
08 Jul 2021
Advancing CTC-CRF Based End-to-End Speech Recognition with Wordpieces and Conformers
Huahuan Zheng
Wenjie Peng
Zhijian Ou
Jinsong Zhang
10
5
0
07 Jul 2021
GLiT: Neural Architecture Search for Global and Local Image Transformer
Boyu Chen
Peixia Li
Chuming Li
Baopu Li
Lei Bai
Chen Lin
Ming-hui Sun
Junjie Yan
Wanli Ouyang
ViT
24
85
0
07 Jul 2021
A Comparative Study of Modular and Joint Approaches for Speaker-Attributed ASR on Monaural Long-Form Audio
Naoyuki Kanda
Xiong Xiao
Jian Wu
Tianyan Zhou
Yashesh Gaur
Xiaofei Wang
Zhong Meng
Zhuo Chen
Takuya Yoshioka
6
13
0
06 Jul 2021
The NiuTrans End-to-End Speech Translation System for IWSLT 2021 Offline Task
Chen Xu
Xiaoqian Liu
Xiaowen Liu
Laohu Wang
Canan Huang
Tong Xiao
Jingbo Zhu
23
5
0
06 Jul 2021
Investigation of Practical Aspects of Single Channel Speech Separation for ASR
Jian Wu
Zhuo Chen
Sanyuan Chen
Yu-Huan Wu
Takuya Yoshioka
Naoyuki Kanda
Shujie Liu
Jinyu Li
14
17
0
05 Jul 2021
Relaxed Attention: A Simple Method to Boost Performance of End-to-End Automatic Speech Recognition
Timo Lohrenz
P. Schwarz
Zhengyang Li
Tim Fingscheidt
13
11
0
02 Jul 2021
Dual Causal/Non-Causal Self-Attention for Streaming End-to-End Speech Recognition
Niko Moritz
Takaaki Hori
Jonathan Le Roux
6
20
0
02 Jul 2021
ESPnet-ST IWSLT 2021 Offline Speech Translation System
H. Inaguma
Shun Kiyono
Nelson Enrique Yalta Soplin
Pengcheng Guo
Jun Suzuki
Kevin Duh
Shinji Watanabe
3DV
32
2
0
01 Jul 2021
StableEmit: Selection Probability Discount for Reducing Emission Latency of Streaming Monotonic Attention ASR
H. Inaguma
Tatsuya Kawahara
17
4
0
01 Jul 2021
IMS' Systems for the IWSLT 2021 Low-Resource Speech Translation Task
Pavel Denisov
Manuel Mager
Ngoc Thang Vu
27
6
0
30 Jun 2021
DF-Conformer: Integrated architecture of Conv-TasNet and Conformer using linear complexity self-attention for speech enhancement
Yuma Koizumi
Shigeki Karita
Scott Wisdom
Hakan Erdogan
J. Hershey
Llion Jones
M. Bacchiani
19
41
0
30 Jun 2021
N-Singer: A Non-Autoregressive Korean Singing Voice Synthesis System for Pronunciation Enhancement
Gyeong-Hoon Lee
Tae-Woo Kim
Hanbin Bae
Min-Ji Lee
Young-Ik Kim
Hoon-Young Cho
VLM
6
19
0
29 Jun 2021
Stable, Fast and Accurate: Kernelized Attention with Relative Positional Encoding
Shengjie Luo
Shanda Li
Tianle Cai
Di He
Dinglan Peng
Shuxin Zheng
Guolin Ke
Liwei Wang
Tie-Yan Liu
19
50
0
23 Jun 2021
An Improved Single Step Non-autoregressive Transformer for Automatic Speech Recognition
Ruchao Fan
Wei Chu
Peng Chang
Jing Xiao
Abeer Alwan
14
15
0
18 Jun 2021
Multi-mode Transformer Transducer with Stochastic Future Context
Kwangyoun Kim
Felix Wu
Prashant Sridhar
Kyu Jeong Han
Shinji Watanabe
30
9
0
17 Jun 2021
Efficient Conformer with Prob-Sparse Attention Mechanism for End-to-End Speech Recognition
Xiong Wang
Sining Sun
Lei Xie
Long Ma
16
18
0
17 Jun 2021