VQVAE Unsupervised Unit Discovery and Multi-scale Code2Spec Inverter for Zerospeech Challenge 2019

27 May 2019

Haizhou Li

Papers citing "VQVAE Unsupervised Unit Discovery and Multi-scale Code2Spec Inverter for Zerospeech Challenge 2019"

Title
Arch-LLM: Taming LLMs for Neural Architecture Generation via Unsupervised Discrete Representation Learning Deshani Geethika Poddenige Sachith Seneviratne Damith A. Senanayake Mahesan Niranjan PN Suganthan Saman K. Halgamuge 82 0 0 28 Mar 2025
LAST: Language Model Aware Speech Tokenization A. Turetzky Yossi Adi 83 3 0 05 Sep 2024
Capsule Enhanced Variational AutoEncoder for Underwater Image Reconstruction Rita Pucci N. Martinel 65 1 0 03 Jun 2024
A Survey of Deep Learning Audio Generation Methods Matej Bozic Marko Horvat VLM MedIm 104 2 0 31 May 2024
Converting Anyone's Voice: End-to-End Expressive Voice Conversion with a Conditional Diffusion Model Zongyang Du Junchen Lu Kun Zhou Lakshmish Kaushik Berrak Sisman 104 1 0 02 May 2024
Exploratory Evaluation of Speech Content Masking Jennifer Williams Karla Pizzi Paul-Gauthier Noé Sneha Das 70 3 0 08 Jan 2024
Acoustic BPE for Speech Generation with Discrete Tokens Feiyu Shen Yiwei Guo Chenpeng Du Xie Chen Kai Yu 95 13 0 23 Oct 2023
Generative Spoken Language Model based on continuous word-sized audio tokens Robin Algayres Yossi Adi Tu Nguyen Jade Copet Gabriel Synnaeve Benoît Sagot Emmanuel Dupoux AuLLM 119 16 0 08 Oct 2023
Textually Pretrained Speech Language Models Michael Hassid Tal Remez Tu Nguyen Itai Gat Alexis Conneau ... Alexandre Défossez Gabriel Synnaeve Emmanuel Dupoux Roy Schwartz Yossi Adi VLM SyDa 131 61 0 22 May 2023
UW-CVGAN: UnderWater Image Enhancement with Capsules Vectors Quantization Rita Pucci C. Micheloni N. Martinel 61 0 0 02 Feb 2023
Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers Chengyi Wang Sanyuan Chen Yu-Huan Wu Zi-Hua Zhang Long Zhou ... Huaming Wang Jinyu Li Lei He Sheng Zhao Furu Wei 193 727 0 05 Jan 2023
Augmentation Invariant Discrete Representation for Generative Spoken Language Modeling Itai Gat Felix Kreuk Tu Nguyen Ann Lee Jade Copet Gabriel Synnaeve Emmanuel Dupoux Yossi Adi 75 11 0 30 Sep 2022
An Initial study on Birdsong Re-synthesis Using Neural Vocoders Rhythm Bhatia Tomi Kinnunen 51 1 0 21 Sep 2022
ASR2K: Speech Recognition for Around 2000 Languages without Audio Xinjian Li Florian Metze David R. Mortensen A. Black Shinji Watanabe 59 28 0 06 Sep 2022
A Comparative Study of Self-supervised Speech Representation Based Voice Conversion Wen-Chin Huang Shu-Wen Yang Tomoki Hayashi Tomoki Toda 66 17 0 10 Jul 2022
Self-supervised speech unit discovery from articulatory and acoustic features using VQ-VAE Marc-Antoine Georges J. Schwartz Thomas Hueber SSL 114 5 0 17 Jun 2022
Self-Supervised Speech Representation Learning: A Review Abdel-rahman Mohamed Hung-yi Lee Lasse Borgholt Jakob Drachmann Havtorn Joakim Edin ... Shang-Wen Li Karen Livescu Lars Maaløe Tara N. Sainath Shinji Watanabe SSL AI4TS 291 368 0 21 May 2022
Discovering Phonetic Inventories with Crosslingual Automatic Speech Recognition Piotr Żelasko Siyuan Feng Laureano Moro-Velazquez A. Abavisani Saurabhchand Bhati O. Scharenborg M. Hasegawa-Johnson Najim Dehak 105 16 0 26 Jan 2022
Disentanglement of Emotional Style and Speaker Identity for Expressive Voice Conversion Zongyang Du Berrak Sisman Kun Zhou Haizhou Li 93 24 0 20 Oct 2021
CLSRIL-23: Cross Lingual Speech Representations for Indic Languages Anirudh Gupta Harveen Singh Chadha Priyanshi Shah Neeraj Chimmwal Ankur Dhuriya Rishabh Gaur Vivek Raghavan 69 37 0 15 Jul 2021
A Survey on Neural Speech Synthesis Xu Tan Tao Qin Frank Soong Tie-Yan Liu AI4TS 133 359 0 29 Jun 2021
Preliminary study on using vector quantization latent spaces for TTS/VC systems with consistent performance Hieu-Thi Luong Junichi Yamagishi 85 0 0 25 Jun 2021
Discrete representations in neural models of spoken language Bertrand Higy Lieke Gelderloos Afra Alishahi Grzegorz Chrupała 140 6 0 12 May 2021
Voice Conversion Based Speaker Normalization for Acoustic Unit Discovery Thomas Glarner Janek Ebbers Reinhold Häb-Umbach DRL 20 1 0 04 May 2021
Generative Spoken Language Modeling from Raw Audio Kushal Lakhotia Evgeny Kharitonov Wei-Ning Hsu Yossi Adi Adam Polyak ... Tu Nguyen Jade Copet Alexei Baevski A. Mohamed Emmanuel Dupoux AuLLM 292 366 0 01 Feb 2021
Text-Free Image-to-Speech Synthesis Using Learned Segmental Units Wei-Ning Hsu David Harwath Christopher Song James R. Glass CLIP 90 67 0 31 Dec 2020
The effectiveness of unsupervised subword modeling with autoregressive and cross-lingual phone-aware networks Siyuan Feng O. Scharenborg SSL 54 3 0 17 Dec 2020
Unsupervised Learning of Disentangled Speech Content and Style Representation Andros Tjandra Ruoming Pang Yu Zhang Shigeki Karita BDL DRL 73 15 0 24 Oct 2020
End-to-End Text-to-Speech using Latent Duration based on VQ-VAE Yusuke Yasuda Xin Wang Junichi Yamagishi 68 17 0 19 Oct 2020
Exploration of End-to-end Synthesisers for Zero Resource Speech Challenge 2020 Karthik Pandia D.S. Anusha Prakash M. M. H. Murthy 42 4 0 10 Sep 2020
Unsupervised Acoustic Unit Representation Learning for Voice Conversion using WaveNet Auto-encoders Mingjie Chen Thomas Hain SSL DRL 54 15 0 16 Aug 2020
Pretraining Techniques for Sequence-to-Sequence Voice Conversion Wen-Chin Huang Tomoki Hayashi Yi-Chiao Wu Hirokazu Kameoka Tomoki Toda 118 40 0 07 Aug 2020
Unsupervised Subword Modeling Using Autoregressive Pretraining and Cross-Lingual Phone-Aware Modeling Siyuan Feng O. Scharenborg SSL 72 4 0 25 Jul 2020
Incorporating Reinforced Adversarial Learning in Autoregressive Image Generation Kenan E. Ak N. Xu Zhe Lin Yilin Wang 85 13 0 20 Jul 2020
Unsupervised Cross-lingual Representation Learning for Speech Recognition Alexis Conneau Alexei Baevski R. Collobert Abdel-rahman Mohamed Michael Auli SSL 171 782 0 24 Jun 2020
wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations Alexei Baevski Henry Zhou Abdel-rahman Mohamed Michael Auli SSL 325 5,874 0 20 Jun 2020
UWSpeech: Speech to Speech Translation for Unwritten Languages Chen Zhang Xu Tan Yi Ren Tao Qin Ke-jun Zhang Tie-Yan Liu 49 56 0 14 Jun 2020
Improving Unsupervised Sparsespeech Acoustic Models with Categorical Reparameterization Benjamin Milde Christian Biemann 23 1 0 29 May 2020
Transformer VQ-VAE for Unsupervised Unit Discovery and Speech Synthesis: ZeroSpeech 2020 Challenge Andros Tjandra S. Sakti Satoshi Nakamura 67 39 0 24 May 2020
Vector-quantized neural networks for acoustic unit discovery in the ZeroSpeech 2020 challenge Benjamin van Niekerk Leanne Nortje Herman Kamper 120 117 0 19 May 2020
Bayesian Subspace HMM for the Zerospeech 2020 Challenge Bolaji Yusuf Lucas Ondel BDL 54 0 0 19 May 2020
Robust Training of Vector Quantized Bottleneck Models A. Lancucki J. Chorowski Guillaume Sanchez R. Marxer Nanxin Chen Hans J. G. A. Dolfing Sameer Khurana Tanel Alumäe Antoine Laurent 86 60 0 18 May 2020
Improved Prosody from Learned F0 Codebook Representations for VQ-VAE Speech Waveform Reconstruction Yi Zhao Haoyu Li Cheng-I Jeff Lai Jennifer Williams Erica Cooper Junichi Yamagishi 84 18 0 16 May 2020
Exploring TTS without T Using Biologically/Psychologically Motivated Neural Network Modules (ZeroSpeech 2020) Takashi Morita H. Koda 69 8 0 11 May 2020
WaveTTS: Tacotron-based TTS with Joint Time-Frequency Domain Loss Rui Liu Berrak Sisman F. Bao Guanglai Gao Haizhou Li 125 14 0 02 Feb 2020
vq-wav2vec: Self-Supervised Learning of Discrete Speech Representations Alexei Baevski Steffen Schneider Michael Auli SSL 187 668 0 12 Oct 2019
Speech-to-speech Translation between Untranscribed Unknown Languages Andros Tjandra S. Sakti Satoshi Nakamura 64 49 0 02 Oct 2019
Bootstrapping non-parallel voice conversion from speaker-adaptive text-to-speech Hieu-Thi Luong Junichi Yamagishi 81 17 0 14 Sep 2019
The Zero Resource Speech Challenge 2019: TTS without T Ewan Dunbar Robin Algayres Julien Karadayi Mathieu Bernard Juan Benjumea ... Lucas Ondel A. Black Laurent Besacier S. Sakti Emmanuel Dupoux 100 117 0 25 Apr 2019