Self-training and Pre-training are Complementary for Speech Recognition

22 October 2020

Papers citing "Self-training and Pre-training are Complementary for Speech Recognition"

Title
Soft-Weighted CrossEntropy Loss for Continous Alzheimer's Disease Detection Xiaohui Zhang Wenjie Fu Mangui Liang 35 1 0 19 Feb 2024
Cross-Domain HAR: Few Shot Transfer Learning for Human Activity Recognition Megha Thukral H. Haresamudram Thomas Ploetz 26 4 0 22 Oct 2023
Fast Word Error Rate Estimation Using Self-Supervised Representations for Speech and Text Chanho Park Chengsong Lu Mingjie Chen Thomas Hain 28 3 0 12 Oct 2023
Some voices are too common: Building fair speech recognition systems using the Common Voice dataset Lucas Maison Yannick Esteve 26 3 0 01 Jun 2023
Rethinking Semi-supervised Learning with Language Models Zhengxiang Shi Francesco Tonolini Nikolaos Aletras Emine Yilmaz G. Kazai Yunlong Jiao 27 18 0 22 May 2023
Deep Transfer Learning for Automatic Speech Recognition: Towards Better Generalization Hamza Kheddar Yassine Himeur S. Al-Maadeed Abbes Amira F. Bensaali 42 76 0 27 Apr 2023
AVFormer: Injecting Vision into Frozen Speech Models for Zero-Shot AV-ASR Paul Hongsuck Seo Arsha Nagrani Cordelia Schmid 27 15 0 29 Mar 2023
ReVISE: Self-Supervised Speech Resynthesis with Visual Input for Universal and Generalized Speech Enhancement Wei-Ning Hsu Tal Remez Bowen Shi Jacob Donley Yossi Adi DiffM 27 11 0 21 Dec 2022
TriNet: stabilizing self-supervised learning from complete or slow collapse on ASR Lixin Cao J. Wang Ben Yang Dan Su Dong Yu 18 4 0 12 Dec 2022
Continuous Soft Pseudo-Labeling in ASR Tatiana Likhomanenko R. Collobert Navdeep Jaitly Samy Bengio VLM 19 3 0 11 Nov 2022
Make More of Your Data: Minimal Effort Data Augmentation for Automatic Speech Recognition and Translation Tsz Kin Lam Shigehiko Schamoni Stefan Riezler VLM 34 8 0 27 Oct 2022
I see what you hear: a vision-inspired method to localize words Mohammad Samragh Arnav Kundu Ting-Yao Hu Minsik Cho Aman Chadha A. Shrivastava Oncel Tuzel Devang Naik ObjD 27 1 0 24 Oct 2022
Transfer Learning of wav2vec 2.0 for Automatic Lyric Transcription Longshen Ou Xiangming Gu Ye Wang 25 21 0 20 Jul 2022
Improving Low-Resource Speech Recognition with Pretrained Speech Models: Continued Pretraining vs. Semi-Supervised Training Mitchell DeHaven J. Billa VLM AI4TS 15 8 0 01 Jul 2022
Do self-supervised speech models develop human-like perception biases? Juliette Millet Ewan Dunbar SSL 19 20 0 31 May 2022
A Deep Reinforcement Learning Blind AI in DareFightingICE Thai Van Nguyen Xincheng Dai Ibrahim Khan R. Thawonmas H. V. Pham VLM 23 7 0 16 May 2022
Wav2Seq: Pre-training Speech-to-Text Encoder-Decoder Models Using Pseudo Languages Felix Wu Kwangyoun Kim Shinji Watanabe Kyu Jeong Han Ryan T. McDonald Kilian Q. Weinberger Yoav Artzi SyDa 40 37 0 02 May 2022
A Complementary Joint Training Approach Using Unpaired Speech and Text for Low-Resource Automatic Speech Recognition Ye Du Jie M. Zhang Qiu-shi Zhu Lirong Dai Ming Wu Xin Fang Zhouwang Yang 24 2 0 05 Apr 2022
Adversarial Speaker Distillation for Countermeasure Model on Automatic Speaker Verification Yen-Lun Liao Xuan-Bo Chen Chung-Che Wang J. Jang AAML 33 8 0 31 Mar 2022
Automatic speaker verification spoofing and deepfake detection using wav2vec 2.0 and data augmentation Hemlata Tak Massimiliano Todisco Xin Wang Jee-weon Jung Junichi Yamagishi Nicholas W. D. Evans 32 151 0 24 Feb 2022
Self-Training: A Survey Massih-Reza Amini Vasilii Feofanov Loïc Pauletto Lies Hadjadj Emilie Devijver Yury Maximov SSL 28 102 0 24 Feb 2022
Improving Automatic Speech Recognition for Non-Native English with Transfer Learning and Language Model Decoding Peter Sullivan Toshiko Shibano Muhammad Abdul-Mageed 34 11 0 10 Feb 2022
Efficient Adapter Transfer of Self-Supervised Speech Models for Automatic Speech Recognition Bethan Thomas Samuel Kessler S. Karout 16 70 0 07 Feb 2022
SPIRAL: Self-supervised Perturbation-Invariant Representation Learning for Speech Pre-Training Wenyong Huang Zhenhe Zhang Y. Yeung Xin Jiang Qun Liu 30 23 0 25 Jan 2022
Sign Language Video Retrieval with Free-Form Textual Queries A. Duarte Samuel Albanie Xavier Giró-i-Nieto Gül Varol SLR 29 29 0 07 Jan 2022
Automatic Speech Recognition Datasets in Cantonese: A Survey and New Dataset Tiezheng Yu Rita Frieske Peng-Tao Xu Samuel Cahyawijaya Cheuk Tung Shadow Yiu ... Elham J. Barezi Qifeng Chen Xiaojuan Ma Bertram E. Shi Pascale Fung RALM 34 9 0 07 Jan 2022
On the Use of External Data for Spoken Named Entity Recognition Ankita Pasad Felix Wu Suwon Shon Karen Livescu Kyu Jeong Han 32 16 0 14 Dec 2021
XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale Arun Babu Changhan Wang Andros Tjandra Kushal Lakhotia Qiantong Xu ... Yatharth Saraf J. Pino Alexei Baevski Alexis Conneau Michael Auli SSL 21 656 0 17 Nov 2021
Comparison of Self-Supervised Speech Pre-Training Methods on Flemish Dutch Jakob Poncelet Hugo Van hamme SSL 25 1 0 29 Sep 2021
Task-adaptive Pre-training and Self-training are Complementary for Natural Language Understanding Shiyang Li Semih Yavuz Wenhu Chen Xifeng Yan 20 12 0 14 Sep 2021
Remember the context! ASR slot error correction through memorization Dhanush Bekal Ashish Shenoy Monica Sunkara S. Bodapati Katrin Kirchhoff KELM 23 12 0 10 Sep 2021
Multi-Task Self-Training for Learning General Representations Golnaz Ghiasi Barret Zoph E. D. Cubuk Quoc V. Le Tsung-Yi Lin SSL 24 100 0 25 Aug 2021
Large-Scale Self- and Semi-Supervised Learning for Speech Translation Changhan Wang Anne Wu J. Pino Alexei Baevski Michael Auli Alexis Conneau SSL 31 44 0 14 Apr 2021
Going deeper with Image Transformers Hugo Touvron Matthieu Cord Alexandre Sablayrolles Gabriel Synnaeve Hervé Jégou ViT 25 986 0 31 Mar 2021
VoxPopuli: A Large-Scale Multilingual Speech Corpus for Representation Learning, Semi-Supervised Learning and Interpretation Changhan Wang M. Rivière Ann Lee Anne Wu Chaitanya Talnikar Daniel Haziza Mary Williamson J. Pino Emmanuel Dupoux SSL 21 459 0 02 Jan 2021