Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1911.08460
Cited By
End-to-end ASR: from Supervised to Semi-Supervised Learning with Modern Architectures
19 November 2019
Gabriel Synnaeve
Qiantong Xu
Jacob Kahn
Tatiana Likhomanenko
Edouard Grave
Vineel Pratap
Anuroop Sriram
Vitaliy Liptchinsky
R. Collobert
SSL
AI4TS
Re-assign community
ArXiv
PDF
HTML
Papers citing
"End-to-end ASR: from Supervised to Semi-Supervised Learning with Modern Architectures"
50 / 62 papers shown
Title
Unlock the Power of Unlabeled Data in Language Driving Model
Chaoqun Wang
Jie-jin Yang
Xiaobin Hong
Ruimao Zhang
53
0
0
13 Mar 2025
emg2qwerty: A Large Dataset with Baselines for Touch Typing using Surface Electromyography
Viswanath Sivakumar
Jeffrey Seely
Alan Du
Sean R Bittner
Adam Berenzweig
Anuoluwapo Bolarinwa
Alexandre Gramfort
Michael I Mandel
13
3
0
26 Oct 2024
Denoising LM: Pushing the Limits of Error Correction Models for Speech Recognition
Zijin Gu
Tatiana Likhomanenko
Richard He Bai
Erik McDermott
R. Collobert
Navdeep Jaitly
AuLLM
51
2
0
24 May 2024
Preuve de concept dún bot vocal dialoguant en wolof
E. Gauthier
Papa Séga Wade
Thierry Moudenc
Patrice Collen
Emilie Guimier De Neef
Oumar Ba
Ndeye Khoyane Cama
Ahmadou Bamba Kebe
Ndeye Aissatou Gningue
Thomas MendoÓ Aristide
29
3
0
02 Apr 2024
Multi-Stage Multi-Modal Pre-Training for Automatic Speech Recognition
Yash Jain
David M. Chan
Pranav Dheram
Aparna Khare
Olabanji Shonibare
Venkatesh Ravichandran
Shalini Ghosh
40
2
0
28 Mar 2024
TransPose: 6D Object Pose Estimation with Geometry-Aware Transformer
Xiao Lin
Deming Wang
Guangliang Zhou
Chengju Liu
Qi Chen
3DPC
ViT
28
8
0
25 Oct 2023
Temporally Aligning Long Audio Interviews with Questions: A Case Study in Multimodal Data Integration
Piyush Singh Pasi
Karthikeya Battepati
P. Jyothi
Ganesh Ramakrishnan
T. Mahapatra
Manoj Singh
51
0
0
10 Oct 2023
AVFormer: Injecting Vision into Frozen Speech Models for Zero-Shot AV-ASR
Paul Hongsuck Seo
Arsha Nagrani
Cordelia Schmid
29
15
0
29 Mar 2023
Improved Self-Supervised Multilingual Speech Representation Learning Combined with Auxiliary Language Information
Fenglin Ding
Genshun Wan
Pengcheng Li
Jia-Yu Pan
Cong Liu
SSL
25
1
0
07 Dec 2022
Continuous Soft Pseudo-Labeling in ASR
Tatiana Likhomanenko
R. Collobert
Navdeep Jaitly
Samy Bengio
VLM
22
3
0
11 Nov 2022
Iterative pseudo-forced alignment by acoustic CTC loss for self-supervised ASR domain adaptation
F. López
Jordi Luque
6
6
0
27 Oct 2022
SpeechUT: Bridging Speech and Text with Hidden-Unit for Encoder-Decoder Based Speech-Text Pre-training
Zi-Hua Zhang
Long Zhou
Junyi Ao
Shujie Liu
Lirong Dai
Jinyu Li
Furu Wei
61
57
0
07 Oct 2022
E-Branchformer: Branchformer with Enhanced merging for speech recognition
Kwangyoun Kim
Felix Wu
Yifan Peng
Jing Pan
Prashant Sridhar
Kyu Jeong Han
Shinji Watanabe
55
105
0
30 Sep 2022
3D Siamese Transformer Network for Single Object Tracking on Point Clouds
Le Hui
Lingpeng Wang
Ling-Yu Tang
Kaihao Lan
Jin Xie
Jian Yang
ViT
3DPC
31
59
0
25 Jul 2022
Data Augmentation for Low-Resource Quechua ASR Improvement
Rodolfo Zevallos
Núria Bel
Guillermo Cámbara
Mireia Farrús
Jordi Luque
VLM
SyDa
19
6
0
14 Jul 2022
Boosting Cross-Domain Speech Recognition with Self-Supervision
Hanjing Zhu
Gaofeng Cheng
Jindong Wang
Wenxin Hou
Pengyuan Zhang
Yonghong Yan
19
13
0
20 Jun 2022
Censer: Curriculum Semi-supervised Learning for Speech Recognition Based on Self-supervised Pre-training
Bowen Zhang
Songjun Cao
Xiaoming Zhang
Yike Zhang
Long Ma
T. Shinozaki
SSL
20
4
0
16 Jun 2022
Transformer Lesion Tracker
Wen Tang
Han Kang
Haoyue Zhang
Pengxin Yu
C. Arnold
Rongguo Zhang
MedIm
19
6
0
13 Jun 2022
Federated Learning with Partial Model Personalization
Krishna Pillutla
Kshitiz Malik
Abdel-rahman Mohamed
Michael G. Rabbat
Maziar Sanjabi
Lin Xiao
FedML
41
154
0
08 Apr 2022
High-Performance Transformer Tracking
Xin Chen
B. Yan
Jiawen Zhu
Huchuan Lu
Xiang Ruan
D. Wang
ViT
23
33
0
25 Mar 2022
Improving Automatic Speech Recognition for Non-Native English with Transfer Learning and Language Model Decoding
Peter Sullivan
Toshiko Shibano
Muhammad Abdul-Mageed
34
11
0
10 Feb 2022
Mixture-of-Rookies: Saving DNN Computations by Predicting ReLU Outputs
D. Pinto
J. Arnau
Antonio González
31
1
0
10 Feb 2022
SPIRAL: Self-supervised Perturbation-Invariant Representation Learning for Speech Pre-Training
Wenyong Huang
Zhenhe Zhang
Y. Yeung
Xin Jiang
Qun Liu
33
23
0
25 Jan 2022
Miti-DETR: Object Detection based on Transformers with Mitigatory Self-Attention Convergence
Wenchi Ma
Tianxiao Zhang
Guanghui Wang
ViT
33
14
0
26 Dec 2021
Voice Quality and Pitch Features in Transformer-Based Speech Recognition
Guillermo Cámbara
Jordi Luque
Mireia Farrús
19
0
0
21 Dec 2021
Self-Supervised Learning for speech recognition with Intermediate layer supervision
Chengyi Wang
Yu-Huan Wu
Sanyuan Chen
Shujie Liu
Jinyu Li
Yao Qian
Zhenglu Yang
SSL
18
28
0
16 Dec 2021
Textless Speech-to-Speech Translation on Real Data
Ann Lee
Hongyu Gong
Paul-Ambroise Duquenne
Holger Schwenk
Peng-Jen Chen
...
Sravya Popuri
Yossi Adi
J. Pino
Jiatao Gu
Wei-Ning Hsu
28
142
0
15 Dec 2021
Towards Building ASR Systems for the Next Billion Users
Tahir Javed
Sumanth Doddapaneni
A. Raman
Kaushal Bhogale
Gowtham Ramesh
Anoop Kunchukuttan
Pratyush Kumar
Mitesh M. Khapra
44
53
0
06 Nov 2021
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing
Sanyuan Chen
Chengyi Wang
Zhengyang Chen
Yu-Huan Wu
Shujie Liu
...
Yao Qian
Jian Wu
Micheal Zeng
Xiangzhan Yu
Furu Wei
SSL
95
1,700
0
26 Oct 2021
Multi-Modal Pre-Training for Automated Speech Recognition
David M. Chan
Shalini Ghosh
D. Chakrabarty
Björn Hoffmeister
SSL
22
16
0
12 Oct 2021
Word Order Does Not Matter For Speech Recognition
Vineel Pratap
Qiantong Xu
Tatiana Likhomanenko
Gabriel Synnaeve
R. Collobert
32
4
0
12 Oct 2021
Revitalizing CNN Attentions via Transformers in Self-Supervised Visual Representation Learning
Chongjian Ge
Youwei Liang
Yibing Song
Jianbo Jiao
Jue Wang
Ping Luo
ViT
21
36
0
11 Oct 2021
An Exploration of Self-Supervised Pretrained Representations for End-to-End Speech Recognition
Xuankai Chang
Takashi Maekaku
Pengcheng Guo
Jing Shi
Yen-Ju Lu
...
Tianzi Wang
Shu-Wen Yang
Yu Tsao
Hung-yi Lee
Shinji Watanabe
SSL
AI4TS
16
81
0
09 Oct 2021
Unsupervised Speech Segmentation and Variable Rate Representation Learning using Segmental Contrastive Predictive Coding
Saurabhchand Bhati
Jesús Villalba
Piotr Żelasko
Laureano Moro Velázquez
Najim Dehak
SSL
53
22
0
05 Oct 2021
An End-to-End Transformer Model for 3D Object Detection
Ishan Misra
Rohit Girdhar
Armand Joulin
3DPC
ViT
39
470
0
16 Sep 2021
Self-training Improves Pre-training for Few-shot Learning in Task-oriented Dialog Systems
Fei Mi
Wanhao Zhou
Feng Cai
Lingjing Kong
Minlie Huang
Boi Faltings
27
32
0
28 Aug 2021
The HW-TSC's Offline Speech Translation Systems for IWSLT 2021 Evaluation
Minghan Wang
Yuxia Wang
Chang Su
Jiaxin Guo
Yingtao Zhang
...
Shimin Tao
Xingshan Zeng
Liangyou Li
Hao Yang
Ying Qin
17
6
0
09 Aug 2021
Improved Language Identification Through Cross-Lingual Self-Supervised Learning
Andros Tjandra
Diptanu Gon Choudhury
Frank Zhang
Kritika Singh
Alexis Conneau
Alexei Baevski
Assaf Sela
Yatharth Saraf
Michael Auli
VLM
SSL
24
35
0
08 Jul 2021
SynthASR: Unlocking Synthetic Data for Speech Recognition
A. Fazel
Wei Yang
Yulan Liu
Roberto Barra-Chicote
Yi Meng
Roland Maas
J. Droppo
SyDa
13
48
0
14 Jun 2021
Unsupervised Automatic Speech Recognition: A Review
Hanan Aldarmaki
Asad Ullah
Nazar Zaki
VLM
SSL
36
56
0
09 Jun 2021
TrTr: Visual Tracking with Transformer
Moju Zhao
K. Okada
Masayuki Inaba
ViT
28
79
0
09 May 2021
Label-Synchronous Speech-to-Text Alignment for ASR Using Forward and Backward Transformers
Yusuke Kida
Tatsuya Komatsu
M. Togami
16
1
0
21 Apr 2021
Large-Scale Self- and Semi-Supervised Learning for Speech Translation
Changhan Wang
Anne Wu
J. Pino
Alexei Baevski
Michael Auli
Alexis Conneau
SSL
31
44
0
14 Apr 2021
Comparing the Benefit of Synthetic Training Data for Various Automatic Speech Recognition Architectures
Nick Rossenbach
Mohammad Zeineldeen
Benedikt Hilmes
Ralf Schluter
Hermann Ney
28
12
0
12 Apr 2021
Going deeper with Image Transformers
Hugo Touvron
Matthieu Cord
Alexandre Sablayrolles
Gabriel Synnaeve
Hervé Jégou
ViT
27
986
0
31 Mar 2021
Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking
Ning Wang
Wen-gang Zhou
Jie Wang
Houqiang Li
ViT
28
518
0
22 Mar 2021
Transformer-based ASR Incorporating Time-reduction Layer and Fine-tuning with Self-Knowledge Distillation
Md. Akmal Haidar
Chao Xing
Mehdi Rezagholizadeh
21
7
0
17 Mar 2021
Contrastive Semi-supervised Learning for ASR
Alex Xiao
Christian Fuegen
Abdel-rahman Mohamed
26
20
0
09 Mar 2021
A Parallelizable Lattice Rescoring Strategy with Neural Language Models
Ke Li
Daniel Povey
Sanjeev Khudanpur
13
16
0
08 Mar 2021
LEAF: A Learnable Frontend for Audio Classification
Neil Zeghidour
O. Teboul
Félix de Chaumont Quitry
Marco Tagliasacchi
VLM
AAML
82
144
0
21 Jan 2021
1
2
Next