iEmoTTS: Toward Robust Cross-Speaker Emotion Transfer and Control for
Speech Synthesis based on Disentanglement between Prosody and Timbre

29 June 2022

Papers citing "iEmoTTS: Toward Robust Cross-Speaker Emotion Transfer and Control for Speech Synthesis based on Disentanglement between Prosody and Timbre"

Title | Authors | Tag | Metrics | Date
Voice Cloning: Comprehensive Survey | Hussam Azzuni, Abdulmotaleb El Saddik | VLM | 32 0 0 | 01 May 2025
A Review of Human Emotion Synthesis Based on Generative Technology | Fei Ma, Y. Li, Yifan Xie, Y. He, Y. Zhang, ..., Z. Liu, Wei Yao, Fuji Ren, Fei Richard Yu, Shiguang Ni | - | 76 1 0 | 10 Dec 2024
EmoSphere++: Emotion-Controllable Zero-Shot Text-to-Speech via Emotion-Adaptive Spherical Vector | Deok-Hyeon Cho, Hyung-Seok Oh, Seung-Bin Kim, Seong-Whan Lee | - | 39 3 0 | 04 Nov 2024
Facial Expression-Enhanced TTS: Combining Face Representation and Emotion Intensity for Adaptive Speech | Yunji Chu, Yunseob Shim, Unsang Park | - | 20 0 0 | 24 Sep 2024
VoiceShop: A Unified Speech-to-Speech Framework for Identity-Preserving Zero-Shot Voice Editing | Philip Anastassiou, Zhenyu Tang, Kainan Peng, Dongya Jia, Jiaxin Li, Ming Tu, Yuping Wang, Yuxuan Wang, Mingbo Ma | - | 37 4 0 | 10 Apr 2024
Zero-Shot Emotion Transfer For Cross-Lingual Speech Synthesis | Yuke Li, Xinfa Zhu, Yinjiao Lei, Hai Li, Junhui Liu, Danming Xie, Lei Xie | - | 22 3 0 | 06 Oct 2023
DiCLET-TTS: Diffusion Model based Cross-lingual Emotion Transfer for Text-to-Speech -- A Study between English and Mandarin | Tao Li, Chenxu Hu, Jian Cong, Xinfa Zhu, Jingbei Li, Qiao Tian, Yuping Wang, Linfu Xie | DiffM | 17 8 0 | 02 Sep 2023
Comparing normalizing flows and diffusion models for prosody and acoustic modelling in text-to-speech | Guangyan Zhang, Thomas Merritt, M. Ribeiro, Biel Tura Vecino, K. Yanagisawa, ..., Ammar Abbas, Piotr Bilinski, Roberto Barra-Chicote, Daniel Korzekwa, Jaime Lorenzo-Trueba | DiffM | 29 3 0 | 31 Jul 2023
PromptStyle: Controllable Style Transfer for Text-to-Speech with Natural Language Descriptions | Guanghou Liu, Yongmao Zhang, Yinjiao Lei, Yunlin Chen, Rui Wang, Zhifei Li, Linfu Xie | - | 13 36 0 | 31 May 2023
EmoTalk: Speech-Driven Emotional Disentanglement for 3D Face Animation | Ziqiao Peng, Hao Wu, Zhenbo Song, Hao-Xuan Xu, Xiangyu Zhu, Jun He, Hongyan Liu, Zhaoxin Fan | CVBM | 8 99 0 | 20 Mar 2023
An Overview of Affective Speech Synthesis and Conversion in the Deep Learning Era | Andreas Triantafyllopoulos, Björn W. Schuller, Gökçe İymen, M. Sezgin, Xiangheng He, ..., Shuo Liu, Silvan Mertes, Elisabeth André, Ruibo Fu, Jianhua Tao | - | 15 53 0 | 06 Oct 2022
Controllable Accented Text-to-Speech Synthesis | Rui Liu, Berrak Sisman, Guanglai Gao, Haizhou Li | - | 13 6 0 | 22 Sep 2022
Disentangling Style and Speaker Attributes for TTS Style Transfer | Xiaochun An, Frank Soong, Lei Xie | - | 43 18 0 | 24 Jan 2022
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis | Ye Jia, Yu Zhang, Ron J. Weiss, Quan Wang, Jonathan Shen, ..., Z. Chen, Patrick Nguyen, Ruoming Pang, Ignacio López Moreno, Yonghui Wu | - | 201 819 0 | 12 Jun 2018