Learning Robust and Multilingual Speech Representations

29 January 2020

Papers citing "Learning Robust and Multilingual Speech Representations"

30 / 30 papers shown

Title
Efficient Training of Self-Supervised Speech Foundation Models on a Compute Budget Andy T. Liu Yi-Cheng Lin Haibin Wu Stefan Winkler Hung-yi Lee 31 1 0 09 Sep 2024
Cross-Modal Coordination Across a Diverse Set of Input Modalities Jorge Sánchez Rodrigo Laguna VLM 35 0 0 29 Jan 2024
Findings of the 2023 ML-SUPERB Challenge: Pre-Training and Evaluation over More Languages and Beyond Jiatong Shi William Chen Dan Berrebbi Hsiu-Hsuan Wang Wei-Ping Huang ... Yuxun Tang Shang-Wen Li Abdelrahman Mohamed Hung-yi Lee Shinji Watanabe LRM ELM 34 15 0 09 Oct 2023
Exploration of Language Dependency for Japanese Self-Supervised Speech Representation Models Takanori Ashihara Takafumi Moriya Kohei Matsuura Tomohiro Tanaka 25 3 0 09 May 2023
Transformers in Speech Processing: A Survey S. Latif Aun Zaidi Heriberto Cuayáhuitl Fahad Shamshad Moazzam Shoukat Junaid Qadir 42 47 0 21 Mar 2023
Supervised Acoustic Embeddings And Their Transferability Across Languages Sreepratha Ram Hanan Aldarmaki SSL 19 3 0 03 Jan 2023
Whodunit? Learning to Contrast for Authorship Attribution Bo Ai Yuchen Wang Yugin Tan Samson Tan SSL 16 16 0 23 Sep 2022
u-HuBERT: Unified Mixed-Modal Speech Pretraining And Zero-Shot Transfer to Unlabeled Modality Wei-Ning Hsu Bowen Shi SSL VLM 19 41 0 14 Jul 2022
Joint Training of Speech Enhancement and Self-supervised Model for Noise-robust ASR Qiu-shi Zhu Jie M. Zhang Zitian Zhang Lirong Dai 35 15 0 26 May 2022
Self-Supervised Speech Representation Learning: A Review Abdel-rahman Mohamed Hung-yi Lee Lasse Borgholt Jakob Drachmann Havtorn Joakim Edin ... Shang-Wen Li Karen Livescu Lars Maaløe Tara N. Sainath Shinji Watanabe SSL AI4TS 128 349 0 21 May 2022
How Does Pre-trained Wav2Vec 2.0 Perform on Domain Shifted ASR? An Extensive Benchmark on Air Traffic Control Communications Juan Pablo Zuluaga Amrutha Prasad Iuliia Nigmatulina Seyyed Saeed Sarfjoo P. Motlícek Matthias Kleinert H. Helmke Oliver Ohneiser Qingran Zhan 13 43 0 31 Mar 2022
Audio Self-supervised Learning: A Survey Shuo Liu Adria Mallol-Ragolta Emilia Parada-Cabeleiro Kun Qian Xingshuo Jing Alexander Kathan Bin Hu Bjoern W. Schuller SSL 35 106 0 02 Mar 2022
A Brief Overview of Unsupervised Neural Speech Representation Learning Lasse Borgholt Jakob Drachmann Havtorn Joakim Edin Lars Maaløe Christian Igel BDL AI4TS SSL 19 11 0 01 Mar 2022
Automatic speaker verification spoofing and deepfake detection using wav2vec 2.0 and data augmentation Hemlata Tak Massimiliano Todisco Xin Wang Jee-weon Jung Junichi Yamagishi Nicholas W. D. Evans 32 151 0 24 Feb 2022
Differentially Private Speaker Anonymization Ali Shahin Shamsabadi B. M. L. Srivastava A. Bellet Nathalie Vauquier Emmanuel Vincent Mohamed Maouche Marc Tommasi Nicolas Papernot MIACV 38 32 0 23 Feb 2022
XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale Arun Babu Changhan Wang Andros Tjandra Kushal Lakhotia Qiantong Xu ... Yatharth Saraf J. Pino Alexei Baevski Alexis Conneau Michael Auli SSL 19 656 0 17 Nov 2021
Improving Noise Robustness of Contrastive Speech Representation Learning with Speech Reconstruction Heming Wang Yao Qian Xiaofei Wang Yiming Wang Chengyi Wang Shujie Liu Takuya Yoshioka Jinyu Li DeLiang Wang 13 29 0 28 Oct 2021
Learning Speaker Representation with Semi-supervised Learning approach for Speaker Profiling Shangeth Rajaa Pham Van Tung Chng Eng Siong 28 5 0 24 Oct 2021
Injecting Text and Cross-lingual Supervision in Few-shot Learning from Self-Supervised Models Matthew Wiesner Desh Raj Sanjeev Khudanpur 51 6 0 10 Oct 2021
Comparison of Self-Supervised Speech Pre-Training Methods on Flemish Dutch Jakob Poncelet Hugo Van hamme SSL 23 1 0 29 Sep 2021
Robust wav2vec 2.0: Analyzing Domain Shift in Self-Supervised Pre-Training Wei-Ning Hsu Anuroop Sriram Alexei Baevski Tatiana Likhomanenko Qiantong Xu ... Jacob Kahn Ann Lee R. Collobert Gabriel Synnaeve Michael Auli SSL 14 235 0 02 Apr 2021
Multilingual Speech Translation with Efficient Finetuning of Pretrained Models Xian Li Changhan Wang Yun Tang C. Tran Yuqing Tang J. Pino Alexei Baevski Alexis Conneau Michael Auli 19 6 0 24 Oct 2020
Representation Learning for Sequence Data with Deep Autoencoding Predictive Components Junwen Bai Weiran Wang Yingbo Zhou Caiming Xiong SSL AI4TS 18 12 0 07 Oct 2020
TERA: Self-Supervised Learning of Transformer Encoder Representation for Speech Andy T. Liu Shang-Wen Li Hung-yi Lee SSL 48 356 0 12 Jul 2020
Data Augmenting Contrastive Learning of Speech Representations in the Time Domain Eugene Kharitonov M. Rivière Gabriel Synnaeve Lior Wolf Pierre-Emmanuel Mazaré Matthijs Douze Emmanuel Dupoux 23 117 0 02 Jul 2020
Unsupervised Cross-lingual Representation Learning for Speech Recognition Alexis Conneau Alexei Baevski R. Collobert Abdel-rahman Mohamed Michael Auli SSL 41 754 0 24 Jun 2020
Self-Supervised Representations Improve End-to-End Speech Translation Anne Wu Changhan Wang J. Pino Jiatao Gu SSL 17 40 0 22 Jun 2020
High-Fidelity Audio Generation and Representation Learning with Guided Adversarial Autoencoder Kazi Nazmul Haque R. Rana Björn W Schuller DRL 24 12 0 01 Jun 2020
BERTphone: Phonetically-Aware Encoder Representations for Utterance-Level Speaker and Language Recognition Shaoshi Ling Julian Salazar Yuzong Liu Katrin Kirchhoff SSL 16 27 0 30 Jun 2019
Contrastive Multiview Coding Yonglong Tian Dilip Krishnan Phillip Isola SSL 56 2,360 0 13 Jun 2019