ResearchTrend.AI
Expressive TTS Training with Frame and Style Reconstruction Loss

4 August 2020
Rui Liu
Berrak Sisman
Guanglai Gao
Haizhou Li
arXiv:2008.01490

Papers citing "Expressive TTS Training with Frame and Style Reconstruction Loss"

40 / 40 papers shown
Title
DubWise: Video-Guided Speech Duration Control in Multimodal LLM-based Text-to-Speech for Dubbing
Neha Sahipjohn
Ashishkumar Gudmalwar
Nirmesh Shah
Pankaj Wasnik
R. Shah
43
5
0
13 Jun 2024
Exploring speech style spaces with language models: Emotional TTS without emotion labels
Shreeram Suresh Chandra
Zongyang Du
Berrak Sisman
38
2
0
18 May 2024
Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling
Rui Liu
Yifan Hu
Yi Ren
Xiang Yin
Haizhou Li
37
16
0
19 Dec 2023
FluentEditor: Text-based Speech Editing by Considering Acoustic and Prosody Consistency
Rui Liu
Jiatian Xi
Ziyue Jiang
Haizhou Li
9
2
0
21 Sep 2023
MSM-VC: High-fidelity Source Style Transfer for Non-Parallel Voice Conversion by Multi-scale Style Modeling
Zhichao Wang
Xinsheng Wang
Qicong Xie
Tao Li
Linfu Xie
Qiao Tian
Yuping Wang
13
4
0
03 Sep 2023
DiCLET-TTS: Diffusion Model based Cross-lingual Emotion Transfer for Text-to-Speech -- A Study between English and Mandarin
Tao Li
Chenxu Hu
Jian Cong
Xinfa Zhu
Jingbei Li
Qiao Tian
Yuping Wang
Linfu Xie
DiffM
24
8
0
02 Sep 2023
DiffProsody: Diffusion-based Latent Prosody Generation for Expressive Speech Synthesis with Prosody Conditional Adversarial Training
H. Oh
Sang-Hoon Lee
Seong-Whan Lee
DiffM
15
14
0
31 Jul 2023
EmoMix: Emotion Mixing via Diffusion Models for Emotional Speech Synthesis
Haobin Tang
Xulong Zhang
Jianzong Wang
Ning Cheng
Jing Xiao
DiffM
14
22
0
01 Jun 2023
Betray Oneself: A Novel Audio DeepFake Detection Model via Mono-to-Stereo Conversion
Rui Liu
Jinhua Zhang
Guanglai Gao
Haizhou Li
18
9
0
25 May 2023
Multi-level Temporal-channel Speaker Retrieval for Zero-shot Voice Conversion
Zhichao Wang
Liumeng Xue
Qiuqiang Kong
Linfu Xie
Yuan-Jui Chen
Qiao Tian
Yuping Wang
BDL
9
3
0
12 May 2023
Accented Text-to-Speech Synthesis with Limited Data
Xuehao Zhou
Mingyang Zhang
Yi Zhou
Zhizheng Wu
Haizhou Li
29
11
0
08 May 2023
Time out of Mind: Generating Rate of Speech conditioned on emotion and speaker
Navjot Kaur
Paige Tuttosi
16
2
0
29 Jan 2023
Delivering Speaking Style in Low-resource Voice Conversion with Multi-factor Constraints
Zhichao Wang
Xinsheng Wang
Linfu Xie
Yuan-Jui Chen
Qiao Tian
Yuping Wang
22
5
0
16 Nov 2022
Explicit Intensity Control for Accented Text-to-speech
Rui Liu
Haolin Zuo
De Hu
Guanglai Gao
Haizhou Li
16
6
0
27 Oct 2022
An Overview of Affective Speech Synthesis and Conversion in the Deep Learning Era
Andreas Triantafyllopoulos
Björn W. Schuller
Gökçe İymen
M. Sezgin
Xiangheng He
...
Shuo Liu
Silvan Mertes
Elisabeth André
Ruibo Fu
Jianhua Tao
15
53
0
06 Oct 2022
Controllable Accented Text-to-Speech Synthesis
Rui Liu
Berrak Sisman
Guanglai Gao
Haizhou Li
21
6
0
22 Sep 2022
Automatic Prosody Annotation with Pre-Trained Text-Speech Model
Ziqian Dai
Jianwei Yu
Yan Wang
Nuo Chen
Yanyao Bian
Guangzhi Li
Deng Cai
Dong Yu
108
7
0
16 Jun 2022
NatiQ: An End-to-end Text-to-Speech System for Arabic
Ahmed Abdelali
Nadir Durrani
C. Demiroğlu
Fahim Dalvi
Hamdy Mubarak
Kareem Darwish
13
14
0
15 Jun 2022
Accurate Emotion Strength Assessment for Seen and Unseen Speech Based on Data-Driven Deep Learning
Rui Liu
Berrak Sisman
Björn Schuller
Guanglai Gao
Haizhou Li
19
11
0
15 Jun 2022
StyleTTS: A Style-Based Generative Model for Natural and Diverse Text-to-Speech Synthesis
Yinghao Aaron Li
Cong Han
N. Mesgarani
33
38
0
30 May 2022
Read the Room: Adapting a Robot's Voice to Ambient and Social Contexts
Paige Tuttosi
Emma Hughson
Akihiro Matsufuji
Angelica Lim
20
4
0
10 May 2022
Disentangling Style and Speaker Attributes for TTS Style Transfer
Xiaochun An
Frank Soong
Lei Xie
54
18
0
24 Jan 2022
Emotion Intensity and its Control for Emotional Voice Conversion
Kun Zhou
Berrak Sisman
R. Rana
Björn W. Schuller
Haizhou Li
52
54
0
10 Jan 2022
Multi-speaker Multi-style Text-to-speech Synthesis With Single-speaker Single-style Training Data Scenarios
Qicong Xie
Tao Li
Xinsheng Wang
Zhichao Wang
Lei Xie
Guoqiao Yu
Guanglu Wan
11
11
0
23 Dec 2021
VisualTTS: TTS with Accurate Lip-Speech Synchronization for Automatic Voice Over
Junchen Lu
Berrak Sisman
Rui Liu
Mingyang Zhang
Haizhou Li
DiffM
32
19
0
07 Oct 2021
StrengthNet: Deep Learning-based Emotion Strength Assessment for Emotional Speech Synthesis
Rui Liu
Berrak Sisman
Haizhou Li
19
2
0
07 Oct 2021
Expressive Voice Conversion: A Joint Framework for Speaker Identity and Emotional Style Transfer
Zongyang Du
Berrak Sisman
Kun Zhou
Haizhou Li
16
20
0
08 Jul 2021
A Survey on Neural Speech Synthesis
Xu Tan
Tao Qin
Frank Soong
Tie-Yan Liu
AI4TS
18
352
0
29 Jun 2021
Improving Performance of Seen and Unseen Speech Style Transfer in End-to-end Neural TTS
Xiaochun An
Frank Soong
Lei Xie
26
9
0
18 Jun 2021
Improving multi-speaker TTS prosody variance with a residual encoder and normalizing flows
Iván Vallés-Pérez
Julian Roth
Grzegorz Beringer
Roberto Barra-Chicote
J. Droppo
21
8
0
10 Jun 2021
Emotional Voice Conversion: Theory, Databases and ESD
Kun Zhou
Berrak Sisman
Rui Liu
Haizhou Li
23
167
0
31 May 2021
Reinforcement Learning for Emotional Text-to-Speech Synthesis with Improved Emotion Discriminability
Rui Liu
Berrak Sisman
Haizhou Li
21
32
0
03 Apr 2021
Limited Data Emotional Voice Conversion Leveraging Text-to-Speech: Two-stage Sequence-to-Sequence Training
Kun Zhou
Berrak Sisman
Haizhou Li
10
27
0
31 Mar 2021
Adversarially learning disentangled speech representations for robust multi-factor voice conversion
Jie Wang
Jingbei Li
Xintao Zhao
Zhiyong Wu
Shiyin Kang
H. Meng
DRL
29
29
0
30 Jan 2021
Fine-grained Style Modeling, Transfer and Prediction in Text-to-Speech Synthesis via Phone-Level Content-Style Disentanglement
Daxin Tan
Tan Lee
11
21
0
08 Nov 2020
VAW-GAN for Disentanglement and Recomposition of Emotional Elements in Speech
Kun Zhou
Berrak Sisman
Haizhou Li
DRL
11
40
0
03 Nov 2020
Learning to Maximize Speech Quality Directly Using MOS Prediction for Neural Text-to-Speech
Yeunju Choi
Youngmoon Jung
Youngjoo Suh
Hoirin Kim
6
6
0
02 Nov 2020
Seen and Unseen emotional style transfer for voice conversion with a new emotional speech dataset
Kun Zhou
Berrak Sisman
Rui Liu
Haizhou Li
10
185
0
28 Oct 2020
GraphSpeech: Syntax-Aware Graph Attention Network For Neural Speech Synthesis
Rui Liu
Berrak Sisman
Haizhou Li
18
24
0
23 Oct 2020
An Overview of Voice Conversion and its Challenges: From Statistical Modeling to Deep Learning
Berrak Sisman
Junichi Yamagishi
Simon King
Haizhou Li
BDL
27
316
0
09 Aug 2020