Self-supervised language learning from raw audio: Lessons from the Zero Resource Speech Challenge

27 October 2022

Papers citing "Self-supervised language learning from raw audio: Lessons from the Zero Resource Speech Challenge"

fastabx: A library for efficient computation of ABX discriminability Maxime Poli Emmanuel Chemla Emmanuel Dupoux 29 0 0 05 May 2025
Developmental Predictive Coding Model for Early Infancy Mono and Bilingual Vocal Continual Learning Xiaodan Chen Alexandre Pitti M. Quoy Nancy F Chen CLL 34 0 0 23 Dec 2024
Analytic Study of Text-Free Speech Synthesis for Raw Audio using a Self-Supervised Learning Model Joonyong Park Daisuke Saito N. Minematsu 67 0 0 04 Dec 2024
SKQVC: One-Shot Voice Conversion by K-Means Quantization with Self-Supervised Speech Representations Youngjun Sim Jinsung Yoon Young-Joo Suh 64 0 0 25 Nov 2024
From Babble to Words: Pre-Training Language Models on Continuous Streams of Phonemes Zébulon Goriely Richard Diehl Martinez Andrew Caines Lisa Beinborn P. Buttery CLL 24 5 0 30 Oct 2024
Anonymising Elderly and Pathological Speech: Voice Conversion Using DDSP and Query-by-Example Suhita Ghosh Melanie Jouaiti Arnab Das Yamini Sinha Tim Polzehl Ingo Siegert Sebastian Stober 18 2 0 20 Oct 2024
Textless NLP -- Zero Resource Challenge with Low Resource Compute Krithiga Ramadass Abrit Pal Singh Srihari J Sheetal Kalyani VLM 13 0 0 24 Sep 2024
Improving Spoken Language Modeling with Phoneme Classification: A Simple Fine-tuning Approach Maxime Poli Emmanuel Chemla Emmanuel Dupoux 21 2 0 16 Sep 2024
STAB: Speech Tokenizer Assessment Benchmark Shikhar Vashishth Harman Singh Shikhar Bharadwaj Sriram Ganapathy Chulayuth Asawaroengchai Kartik Audhkhasi Andrew Rosenberg Ankur Bapna Bhuvana Ramabhadran 41 0 0 04 Sep 2024
SpeechPrompt: Prompting Speech Language Models for Speech Processing Tasks Kai-Wei Chang Haibin Wu Yu-Kai Wang Yuan-Kuei Wu Hua Shen Wei-Cheng Tseng Iu-thing Kang Shang-Wen Li Hung-yi Lee 31 1 0 23 Aug 2024
Simulating Articulatory Trajectories with Phonological Feature Interpolation Angelo Ortiz Tandazo Thomas Schatz Thomas Hueber Emmanuel Dupoux 17 0 0 08 Aug 2024
A model of early word acquisition based on realistic-scale audiovisual naming events Khazar Khorrami Okko Rasanen NAI 22 0 0 07 Jun 2024
Investigating the 'Autoencoder Behavior' in Speech Self-Supervised Models: a focus on HuBERT's Pretraining Valentin Vielzeuf SSL 23 0 0 14 May 2024
Age-Dependent Analysis and Stochastic Generation of Child-Directed Speech Okko Rasanen Daniil Kocharov 22 0 0 13 May 2024
Bigger is not Always Better: The Effect of Context Size on Speech Pre-Training Sean Robertson Ewan Dunbar SSL 8 1 0 03 Dec 2023
Audio-Visual Neural Syntax Acquisition Cheng-I Jeff Lai Freda Shi Puyuan Peng Yoon Kim Kevin Gimpel ... David D. Cox David F. Harwath Yang Zhang Karen Livescu James R. Glass CLIP NAI 32 1 0 11 Oct 2023
XLS-R fine-tuning on noisy word boundaries for unsupervised speech segmentation into words Robin Algayres Pablo Diego-Simon Benoît Sagot Emmanuel Dupoux 10 1 0 08 Oct 2023
Generative Spoken Language Model based on continuous word-sized audio tokens Robin Algayres Yossi Adi Tu Nguyen Jade Copet Gabriel Synnaeve Benoît Sagot Emmanuel Dupoux AuLLM 24 6 0 08 Oct 2023
SALT: Distinguishable Speaker Anonymization Through Latent Space Transformation Yuanjun Lv Jixun Yao Peikun Chen Hongbin Zhou Heng Lu Lei Xie 17 4 0 08 Oct 2023
Simultaneous or Sequential Training? How Speech Representations Cooperate in a Multi-Task Self-Supervised Learning System Khazar Khorrami María Andrea Cruz Blandón Tuomas Virtanen Okko Rasanen SSL 20 1 0 05 Jun 2023
Voice Conversion With Just Nearest Neighbors Matthew Baas Benjamin van Niekerk Herman Kamper SSL 19 48 0 30 May 2023
Syllable Discovery and Cross-Lingual Generalization in a Visually Grounded, Self-Supervised Speech Model Puyuan Peng Shang-Wen Li Okko Rasanen Abdel-rahman Mohamed David F. Harwath SSL VLM 13 7 0 19 May 2023
A benchmark for computational analysis of animal behavior, using animal-borne tags Benjamin Hoffman M. Cusimano V. Baglione D. Canestrari D. Chevallier ... O. Vainio A. Vehkaoja Ken Yoda Katie Zacarian A. Friedlaender 10 7 0 18 May 2023
Evaluating context-invariance in unsupervised speech representations Mark Hallap Emmanuel Dupoux Ewan Dunbar SSL 21 9 0 27 Oct 2022
Unsupervised Speech Segmentation and Variable Rate Representation Learning using Segmental Contrastive Predictive Coding Saurabhchand Bhati Jesús Villalba Piotr Żelasko Laureano Moro Velázquez Najim Dehak SSL 53 22 0 05 Oct 2021
ZR-2021VG: Zero-Resource Speech Challenge, Visually-Grounded Language Modelling track, 2021 edition Afra Alishahi Grzegorz Chrupała Alejandrina Cristià Emmanuel Dupoux Bertrand Higy Marvin Lavechin Okko Rasanen Chen Yu 19 7 0 14 Jul 2021
Speech Representation Learning Combining Conformer CPC with Deep Cluster for the ZeroSpeech Challenge 2021 Takashi Maekaku Xuankai Chang Yuya Fujita Li-Wei Chen Shinji Watanabe Alexander I. Rudnicky 101 13 0 13 Jul 2021
Generative Spoken Language Modeling from Raw Audio Kushal Lakhotia Evgeny Kharitonov Wei-Ning Hsu Yossi Adi Adam Polyak ... Tu Nguyen Jade Copet Alexei Baevski A. Mohamed Emmanuel Dupoux AuLLM 174 336 0 01 Feb 2021
Evaluating the reliability of acoustic speech embeddings Robin Algayres Mohamed Salah Zaiem Benoît Sagot Emmanuel Dupoux 22 28 0 27 Jul 2020