End-to-End Emotional Speech Synthesis Using Style Tokens and Semi-Supervised Training
arXiv:1906.10859

26 June 2019
Peng Wu
Zhenhua Ling
Li-Juan Liu
Yuan Jiang
Hong-Chuan Wu
Lirong Dai

Papers citing "End-to-End Emotional Speech Synthesis Using Style Tokens and Semi-Supervised Training"

35 / 35 papers shown
AMNet: An Acoustic Model Network for Enhanced Mandarin Speech Synthesis
Yubing Cao
Yinfeng Yu
Yongming Li
Liejun Wang
67
0
0
12 Apr 2025
A Review of Human Emotion Synthesis Based on Generative Technology
Fei Ma
Yongqian Li
Yifan Xie
Y. He
Yize Zhang
...
Z. Liu
Wei Yao
Fuji Ren
Fei Richard Yu
Shiguang Ni
123
2
0
10 Dec 2024
An Attribute Interpolation Method in Speech Synthesis by Model Merging
Masato Murata
Koichi Miyazaki
Tomoki Koriyama
MoMe
115
6
0
30 Jun 2024
Exploring speech style spaces with language models: Emotional TTS without emotion labels
Shreeram Suresh Chandra
Zongyang Du
Berrak Sisman
76
2
0
18 May 2024
Code-Mixed Text to Speech Synthesis under Low-Resource Constraints
Raviraj Joshi
Nikesh Garera
79
0
0
02 Dec 2023
Improving severity preservation of healthy-to-pathological voice conversion with global style tokens
B. Halpern
Wen-Chin Huang
Lester Phillip Violeta
R.J.J.H. van Son
Tomoki Toda
125
2
0
04 Oct 2023
MSM-VC: High-fidelity Source Style Transfer for Non-Parallel Voice Conversion by Multi-scale Style Modeling
Zhichao Wang
Xinsheng Wang
Qicong Xie
Tao Li
Linfu Xie
Qiao Tian
Yuping Wang
114
4
0
03 Sep 2023
MSStyleTTS: Multi-Scale Style Modeling with Hierarchical Context Information for Expressive Speech Synthesis
Shunwei Lei
Yixuan Zhou
Liyang Chen
Zhiyong Wu
Xixin Wu
Shiyin Kang
Helen Meng
87
7
0
29 Jul 2023
PromptStyle: Controllable Style Transfer for Text-to-Speech with Natural Language Descriptions
Guanghou Liu
Yongmao Zhang
Yinjiao Lei
Yunlin Chen
Rui Wang
Zhifei Li
Linfu Xie
70
42
0
31 May 2023
GTN-Bailando: Genre Consistent Long-Term 3D Dance Generation based on Pre-trained Genre Token Network
Hao-Wen Zhuang
Shunwei Lei
Long Xiao
Weiqing Li
Liyang Chen
Sicheng Yang
Zhiyong Wu
Shiyin Kang
Helen Meng
78
14
0
25 Apr 2023
Cross-speaker Emotion Transfer by Manipulating Speech Style Latents
Suhee Jo
Younggun Lee
Yookyung Shin
Yeongtae Hwang
Taesu Kim
55
4
0
15 Mar 2023
Semi-supervised learning for continuous emotional intensity controllable speech synthesis with disentangled representations
Yoorim Oh
Juheon Lee
Yoseob Han
Kyogu Lee
67
3
0
11 Nov 2022
An Overview of Affective Speech Synthesis and Conversion in the Deep Learning Era
Andreas Triantafyllopoulos
Björn W. Schuller
Gökçe İymen
M. Sezgin
Xiangheng He
...
Shuo Liu
Silvan Mertes
Elisabeth André
Ruibo Fu
Jianhua Tao
115
57
0
06 Oct 2022
Speech Synthesis with Mixed Emotions
Kun Zhou
Berrak Sisman
R. Rana
B. W. Schuller
Haizhou Li
87
47
0
11 Aug 2022
Language Model-Based Emotion Prediction Methods for Emotional Speech Synthesis Systems
Hyun-Wook Yoon
Ohsung Kwon
Hoyeon Lee
Ryuichi Yamamoto
Eunwoo Song
Jae-Min Kim
Min-Jae Hwang
128
15
0
30 Jun 2022
iEmoTTS: Toward Robust Cross-Speaker Emotion Transfer and Control for Speech Synthesis based on Disentanglement between Prosody and Timbre
Guangyan Zhang
Ying Qin
Weinan Zhang
Jialun Wu
Mei Li
Yu Gai
Feijun Jiang
Tan Lee
108
27
0
29 Jun 2022
Self-supervised Context-aware Style Representation for Expressive Speech Synthesis
Yihan Wu
Xi Wang
S. Zhang
Lei He
Ruihua Song
J. Nie
102
15
0
25 Jun 2022
Towards Multi-Scale Speaking Style Modelling with Hierarchical Context Information for Mandarin Speech Synthesis
Shunwei Lei
Yixuan Zhou
Liyang Chen
Jiankun Hu
Zhiyong Wu
Shiyin Kang
Helen Meng
79
10
0
06 Apr 2022
On incorporating social speaker characteristics in synthetic speech
S. Rallabandi
Sebastian Möller
86
0
0
03 Apr 2022
MuSE-SVS: Multi-Singer Emotional Singing Voice Synthesizer that Controls Emotional Intensity
Sungjae Kim
Y.E. Kim
Jewoo Jun
Injung Kim
107
14
0
02 Mar 2022
Disentangling Style and Speaker Attributes for TTS Style Transfer
Xiaochun An
Frank Soong
Lei Xie
155
18
0
24 Jan 2022
MsEmoTTS: Multi-scale emotion transfer, prediction, and control for emotional speech synthesis
Yinjiao Lei
Shan Yang
Xinsheng Wang
Lei Xie
79
75
0
17 Jan 2022
Improving Emotional Speech Synthesis by Using SUS-Constrained VAE and Text Encoder Aggregation
Fengyu Yang
Jian Luan
Yujun Wang
137
5
0
19 Oct 2021
Cross-speaker Emotion Transfer Based on Speaker Condition Layer Normalization and Semi-Supervised Training in Text-To-Speech
Pengfei Wu
Junjie Pan
Chenchang Xu
Junhui Zhang
Lin Wu
Xiang Yin
Zejun Ma
62
16
0
08 Oct 2021
Cross-speaker emotion disentangling and transfer for end-to-end speech synthesis
Tao Li
Xinsheng Wang
Qicong Xie
Zhichao Wang
Linfu Xie
69
47
0
14 Sep 2021
UniTTS: Residual Learning of Unified Embedding Space for Speech Style Control
M. Kang
Sungjae Kim
Injung Kim
77
3
0
21 Jun 2021
Improving Performance of Seen and Unseen Speech Style Transfer in End-to-end Neural TTS
Xiaochun An
Frank Soong
Lei Xie
119
9
0
18 Jun 2021
EMOVIE: A Mandarin Emotion Speech Dataset with a Simple Emotional Text-to-Speech Model
Chenye Cui
Yi Ren
Jinglin Liu
Feiyang Chen
Rongjie Huang
Ming Lei
Zhou Zhao
66
35
0
17 Jun 2021
Towards Multi-Scale Style Control for Expressive Speech Synthesis
Xiang Li
Changhe Song
Jingbei Li
Zhiyong Wu
Jia Jia
Helen Meng
64
47
0
08 Apr 2021
Reinforcement Learning for Emotional Text-to-Speech Synthesis with Improved Emotion Discriminability
Rui Liu
Berrak Sisman
Haizhou Li
69
32
0
03 Apr 2021
Investigating on Incorporating Pretrained and Learnable Speaker Representations for Multi-Speaker Multi-Style Text-to-Speech
C. Chien
Jheng-hao Lin
Chien-yu Huang
Po-Chun Hsu
Hung-yi Lee
119
70
0
06 Mar 2021
Emotion controllable speech synthesis using emotion-unlabeled dataset with the assistance of cross-domain speech emotion recognition
Xiong Cai
Dongyang Dai
Zhiyong Wu
Xiang Li
Jingbei Li
Helen Meng
94
67
0
26 Oct 2020
Expressive TTS Training with Frame and Style Reconstruction Loss
Rui Liu
Berrak Sisman
Guanglai Gao
Haizhou Li
112
73
0
04 Aug 2020
Multi-Reference Neural TTS Stylization with Adversarial Cycle Consistency
M. Whitehill
Shuang Ma
Daniel J. McDuff
Yale Song
111
35
0
25 Oct 2019
Semi-Supervised Generative Modeling for Controllable Speech Synthesis
Raza Habib
Soroosh Mariooryad
Matt Shannon
Eric Battenberg
RJ Skerry-Ryan
Daisy Stanton
David Kao
Tom Bagby
BDL
68
48
0
03 Oct 2019