ResearchTrend.AI
EmoDiff: Intensity Controllable Emotional Text-to-Speech with Soft-Label Guidance

17 November 2022
Yiwei Guo
Chenpeng Du
Xie Chen
K. Yu
    DiffM

Papers citing "EmoDiff: Intensity Controllable Emotional Text-to-Speech with Soft-Label Guidance"

27 / 27 papers shown
EmoVoice: LLM-based Emotional Text-To-Speech Model with Freestyle Text Prompting
Guanrou Yang
Chen Yang
Qian Chen
Ziyang Ma
Wenxi Chen
...
Fan Yu
Zhihao Du
Zhifu Gao
Shiliang Zhang
Xie Chen
AuLLM
53
0
0
17 Apr 2025
Shushing! Let's Imagine an Authentic Speech from the Silent Video
Jiaxin Ye
Hongming Shan
DiffM
VGen
61
1
0
19 Mar 2025
EmoDubber: Towards High Quality and Emotion Controllable Movie Dubbing
Gaoxiang Cong
Jiadong Pan
Liang-Sheng Li
Yuankai Qi
Yuxin Peng
A. Hengel
Jian Yang
Qingming Huang
90
6
0
12 Dec 2024
A Review of Human Emotion Synthesis Based on Generative Technology
Fei Ma
Y. Li
Yifan Xie
Y. He
Y. Zhang
...
Z. Liu
Wei Yao
Fuji Ren
Fei Richard Yu
Shiguang Ni
76
1
0
10 Dec 2024
Facial Expression-Enhanced TTS: Combining Face Representation and Emotion Intensity for Adaptive Speech
Yunji Chu
Yunseob Shim
Unsang Park
18
0
0
24 Sep 2024
Emo-DPO: Controllable Emotional Speech Synthesis through Direct Preference Optimization
Xiaoxue Gao
Chen Zhang
Yiming Chen
Huayun Zhang
Nancy F. Chen
32
6
0
16 Sep 2024
Laugh Now Cry Later: Controlling Time-Varying Emotional States of Flow-Matching-Based Zero-Shot Text-to-Speech
Haibin Wu
Xiaofei Wang
Sefik Emre Eskimez
Manthan Thakker
Daniel Tompkins
...
Canrun Li
Zhen Xiao
Sheng Zhao
Jinyu Li
Naoyuki Kanda
20
6
0
17 Jul 2024
Articulatory Phonetics Informed Controllable Expressive Speech Synthesis
Zehua Kcriss Li
Meiying Melissa Chen
Yi Zhong
Pinxin Liu
Zhiyao Duan
26
0
0
15 Jun 2024
RSET: Remapping-based Sorting Method for Emotion Transfer Speech Synthesis
Haoxiang Shi
Jianzong Wang
Xulong Zhang
Ning Cheng
Jun Yu
Jing Xiao
28
2
0
27 May 2024
AUTODIFF: Autoregressive Diffusion Modeling for Structure-based Drug Design
Xinze Li
Penglei Wang
Tianfan Fu
Wenhao Gao
Chengtao Li
Leilei Shi
Junhong Liu
33
2
0
02 Apr 2024
Humane Speech Synthesis through Zero-Shot Emotion and Disfluency Generation
Rohan Chaudhury
Mihir Godbole
Aakash Garg
Jinsil Hwaryoung Seo
25
0
0
31 Mar 2024
On the Semantic Latent Space of Diffusion-Based Text-to-Speech Models
Miri Varshavsky-Hassid
Roy Hirsch
Regev Cohen
Tomer Golany
Daniel Freedman
Ehud Rivlin
23
3
0
19 Feb 2024
ED-TTS: Multi-Scale Emotion Modeling using Cross-Domain Emotion Diarization for Emotional Speech Synthesis
Haobin Tang
Xulong Zhang
Ning Cheng
Jing Xiao
Jianzong Wang
13
10
0
16 Jan 2024
DurFlex-EVC: Duration-Flexible Emotional Voice Conversion Leveraging Discrete Representations without Text Alignment
Hyoung-Seok Oh
Sang-Hoon Lee
Deok-Hyun Cho
Seong-Whan Lee
34
1
0
16 Jan 2024
StyleCap: Automatic Speaking-Style Captioning from Speech Based on Speech and Language Self-supervised Learning Models
Kazuki Yamauchi
Yusuke Ijima
Yuki Saito
17
8
0
28 Nov 2023
Expressive TTS Driven by Natural Language Prompts Using Few Human Annotations
Hanglei Zhang
Yiwei Guo
Sen Liu
Xie Chen
Kai Yu
17
0
0
02 Nov 2023
Leveraging Speech PTM, Text LLM, and Emotional TTS for Speech Emotion Recognition
Ziyang Ma
Wen Wu
Zhisheng Zheng
Yiwei Guo
Qian Chen
Shiliang Zhang
Xie Chen
16
14
0
19 Sep 2023
EMOCONV-DIFF: Diffusion-based Speech Emotion Conversion for Non-parallel and In-the-wild Data
N. Prabhu
Bunlong Lay
Simon Welker
N. Lehmann-Willenbrock
Timo Gerkmann
DiffM
14
3
0
14 Sep 2023
VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching
Yiwei Guo
Chenpeng Du
Ziyang Ma
Xie Chen
K. Yu
DiffM
23
36
0
10 Sep 2023
Matcha-TTS: A fast TTS architecture with conditional flow matching
Shivam Mehta
Ruibo Tu
Jonas Beskow
Éva Székely
G. Henter
14
68
0
06 Sep 2023
AffectEcho: Speaker Independent and Language-Agnostic Emotion and Affect Transfer for Speech Synthesis
Hrishikesh Viswanath
Aneesh Bhattacharya
Pascal Jutras-Dubé
Prerit Gupta
Mridu Prashanth
Yashvardhan Khaitan
Aniket Bera
11
0
0
16 Aug 2023
EmoSpeech: Guiding FastSpeech2 Towards Emotional Text to Speech
Daria Diatlova
V. Shutov
13
7
0
28 Jun 2023
CASEIN: Cascading Explicit and Implicit Control for Fine-grained Emotion Intensity Regulation
Yuhao Cui
Xiongwei Wang
Zhongzhou Zhao
Wei Zhou
Haiqing Chen
20
1
0
27 Jun 2023
DSE-TTS: Dual Speaker Embedding for Cross-Lingual Text-to-Speech
Sen Liu
Yiwei Guo
Chenpeng Du
Xie Chen
Kai Yu
13
6
0
25 Jun 2023
EmoMix: Emotion Mixing via Diffusion Models for Emotional Speech Synthesis
Haobin Tang
Xulong Zhang
Jianzong Wang
Ning Cheng
Jing Xiao
DiffM
14
22
0
01 Jun 2023
ZET-Speech: Zero-shot adaptive Emotion-controllable Text-to-Speech Synthesis with Diffusion and Style-based Models
Minki Kang
Wooseok Han
S. Hwang
Eunho Yang
DiffM
15
15
0
23 May 2023
A Survey on Audio Diffusion Models: Text To Speech Synthesis and Enhancement in Generative AI
Chenshuang Zhang
Chaoning Zhang
Sheng Zheng
Mengchun Zhang
Maryam Qamar
Sung-Ho Bae
In So Kweon
DiffM
MedIm
39
64
0
23 Mar 2023