ZET-Speech: Zero-shot adaptive Emotion-controllable Text-to-Speech
Synthesis with Diffusion and Style-based Models

23 May 2023

Papers citing "ZET-Speech: Zero-shot adaptive Emotion-controllable Text-to-Speech Synthesis with Diffusion and Style-based Models"

14 / 14 papers shown

Title | Authors | Tags | Counts | Date
Voice Cloning: Comprehensive Survey | Hussam Azzuni, Abdulmotaleb El Saddik | VLM | 32 0 0 | 01 May 2025
EmoVoice: LLM-based Emotional Text-To-Speech Model with Freestyle Text Prompting | Guanrou Yang, Chen Yang, Qian Chen, Ziyang Ma, Wenxi Chen, ..., Fan Yu, Zhihao Du, Zhifu Gao, Shiliang Zhang, Xie Chen | AuLLM | 53 0 0 | 17 Apr 2025
A Review of Human Emotion Synthesis Based on Generative Technology | Fei Ma, Y. Li, Yifan Xie, Y. He, Y. Zhang, ..., Z. Liu, Wei Yao, Fuji Ren, Fei Richard Yu, Shiguang Ni | - | 76 1 0 | 10 Dec 2024
EmoKnob: Enhance Voice Cloning with Fine-Grained Emotion Control | Haozhe Chen, Run Chen, Julia Hirschberg | - | 21 3 0 | 01 Oct 2024
Exploring synthetic data for cross-speaker style transfer in style representation based TTS | Lucas Ueda, Leonardo B. de M. M. Marques, Flávio O. Simões, Mário Uliani Neto, Fernando Runstein, Bianca Dal Bó, Paula D. P. Costa | - | 21 0 0 | 25 Sep 2024
StyleFusion TTS: Multimodal Style-control and Enhanced Feature Fusion for Zero-shot Text-to-speech Synthesis | Zhiyong Chen, Xinnuo Li, Zhiqi Ai, Shugong Xu | DiffM | 34 1 0 | 24 Sep 2024
Enhancing Emotional Text-to-Speech Controllability with Natural Language Guidance through Contrastive Learning and Diffusion Models | Xin Jing, Kun Zhou, Andreas Triantafyllopoulos, Björn W. Schuller | DiffM | 27 3 0 | 10 Sep 2024
EmoSphere-TTS: Emotional Style and Intensity Modeling via Spherical Emotion Vector for Controllable Emotional Text-to-Speech | Deok-Hyeon Cho, Hyung-Seok Oh, Seung-Bin Kim, Sang-Hoon Lee, Seong-Whan Lee | - | 29 6 0 | 12 Jun 2024
U-Style: Cascading U-nets with Multi-level Speaker and Style Modeling for Zero-Shot Voice Cloning | Tao Li, Zhichao Wang, Xinfa Zhu, Jian Cong, Qiao Tian, Yuping Wang, Lei Xie | DiffM | 25 3 0 | 06 Oct 2023
On the Design Fundamentals of Diffusion Models: A Survey | Ziyi Chang, G. Koulieris, Hubert P. H. Shum | DiffM | 27 52 0 | 07 Jun 2023
Cross-speaker Emotion Transfer Based On Prosody Compensation for End-to-End Speech Synthesis | Tao Li, Xinsheng Wang, Qicong Xie, Zhichao Wang, Ming Jiang, Linfu Xie | - | 19 15 0 | 04 Jul 2022
Guided-TTS 2: A Diffusion Model for High-quality Adaptive Text-to-Speech with Untranscribed Data | Sungwon Kim, Heeseung Kim, Sung-Hoon Yoon | DiffM | 196 52 0 | 30 May 2022
A Style-Based Generator Architecture for Generative Adversarial Networks | Tero Karras, S. Laine, Timo Aila | - | 262 10,320 0 | 12 Dec 2018
Domain-Adversarial Training of Neural Networks | Yaroslav Ganin, E. Ustinova, Hana Ajakan, Pascal Germain, Hugo Larochelle, François Laviolette, M. Marchand, Victor Lempitsky | GAN, OOD | 149 9,316 0 | 28 May 2015