Zero-shot text-to-speech synthesis conditioned using self-supervised speech representation model

24 April 2023

Takanori Ashihara

Hiroki Kanagawa

Takafumi Moriya

Papers citing "Zero-shot text-to-speech synthesis conditioned using self-supervised speech representation model"

| Title | Authors | Tags | Counts | Date |
|---|---|---|---|---|
| Voice Cloning: Comprehensive Survey | Hussam Azzuni, Abdulmotaleb El Saddik | VLM | 32 0 0 | 01 May 2025 |
| Description-based Controllable Text-to-Speech with Cross-Lingual Voice Control | Ryuichi Yamamoto, Yuma Shirahata, Masaya Kawamura, Kentaro Tachibana | DiffM | 32 2 0 | 26 Sep 2024 |
| Text-To-Speech Synthesis In The Wild | Jee-weon Jung, Wangyou Zhang, Soumi Maiti, Yihan Wu, Xin Wang, ..., Hye-jin Shim, Nicholas W. D. Evans, Joon Son Chung, Shinnosuke Takamichi, Shinji Watanabe | | 32 1 0 | 13 Sep 2024 |
| Lightweight Zero-shot Text-to-Speech with Mixture of Adapters | Kenichi Fujita, Takanori Ashihara, Marc Delcroix, Yusuke Ijima | | 30 2 0 | 01 Jul 2024 |
| Noise-robust zero-shot text-to-speech synthesis conditioned on self-supervised speech-representation model with adapters | Kenichi Fujita, Hiroshi Sato, Takanori Ashihara, Hiroki Kanagawa, Marc Delcroix, Takafumi Moriya, Yusuke Ijima | | 23 8 0 | 10 Jan 2024 |
| StyleCap: Automatic Speaking-Style Captioning from Speech Based on Speech and Language Self-supervised Learning Models | Kazuki Yamauchi, Yusuke Ijima, Yuki Saito | | 19 8 0 | 28 Nov 2023 |
| StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models | Yinghao Aaron Li, Cong Han, Vinay S. Raghavan, Gavin Mischler, N. Mesgarani | VLM, DiffM | 31 107 0 | 13 Jun 2023 |
| Streaming Target-Speaker ASR with Neural Transducer | Takafumi Moriya, Hiroshi Sato, Tsubasa Ochiai, Marc Delcroix, T. Shinozaki | | 26 21 0 | 09 Sep 2022 |
| GenerSpeech: Towards Style Transfer for Generalizable Out-Of-Domain Text-to-Speech | Rongjie Huang, Yi Ren, Jinglin Liu, Chenye Cui, Zhou Zhao | OODD, VLM | 115 34 0 | 15 May 2022 |