ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1905.07195
  4. Cited By
CHiVE: Varying Prosody in Speech Synthesis with a Linguistically Driven
  Dynamic Hierarchical Conditional Variational Network

CHiVE: Varying Prosody in Speech Synthesis with a Linguistically Driven Dynamic Hierarchical Conditional Variational Network

17 May 2019
V. Wan
Chun-an Chan
Tom Kenter
Jakub Vít
R. Clark
ArXivPDFHTML

Papers citing "CHiVE: Varying Prosody in Speech Synthesis with a Linguistically Driven Dynamic Hierarchical Conditional Variational Network"

43 / 43 papers shown
Title
PRESENT: Zero-Shot Text-to-Prosody Control
PRESENT: Zero-Shot Text-to-Prosody Control
Perry Lam
Huayun Zhang
Nancy F. Chen
Berrak Sisman
Dorien Herremans
43
0
0
13 Aug 2024
A Human-in-the-Loop Approach to Improving Cross-Text Prosody Transfer
A Human-in-the-Loop Approach to Improving Cross-Text Prosody Transfer
Himanshu Maurya
A. Sigurgeirsson
25
0
0
06 Jun 2024
DPP-TTS: Diversifying prosodic features of speech via determinantal
  point processes
DPP-TTS: Diversifying prosodic features of speech via determinantal point processes
Seongho Joo
Hyukhun Koh
Kyomin Jung
DiffM
39
0
0
23 Oct 2023
PMVC: Data Augmentation-Based Prosody Modeling for Expressive Voice
  Conversion
PMVC: Data Augmentation-Based Prosody Modeling for Expressive Voice Conversion
Yimin Deng
Huaizhen Tang
Xulong Zhang
Jianzong Wang
Ning Cheng
Jing Xiao
13
7
0
21 Aug 2023
Comparing normalizing flows and diffusion models for prosody and
  acoustic modelling in text-to-speech
Comparing normalizing flows and diffusion models for prosody and acoustic modelling in text-to-speech
Guangyan Zhang
Thomas Merritt
M. Ribeiro
Biel Tura Vecino
K. Yanagisawa
...
Ammar Abbas
Piotr Bilinski
Roberto Barra-Chicote
Daniel Korzekwa
Jaime Lorenzo-Trueba
DiffM
31
3
0
31 Jul 2023
The Ethical Implications of Generative Audio Models: A Systematic
  Literature Review
The Ethical Implications of Generative Audio Models: A Systematic Literature Review
J. Barnett
16
25
0
07 Jul 2023
Improving Prosody for Cross-Speaker Style Transfer by Semi-Supervised
  Style Extractor and Hierarchical Modeling in Speech Synthesis
Improving Prosody for Cross-Speaker Style Transfer by Semi-Supervised Style Extractor and Hierarchical Modeling in Speech Synthesis
Chunyu Qiang
Peng Yang
Hao Che
Ying Zhang
Xiaorui Wang
Zhong-ming Wang
38
9
0
14 Mar 2023
Emotion Selectable End-to-End Text-based Speech Editing
Emotion Selectable End-to-End Text-based Speech Editing
Tao Wang
Jiangyan Yi
Ruibo Fu
J. Tao
Zhengqi Wen
Chu Yuan Zhang
30
2
0
20 Dec 2022
Style-Label-Free: Cross-Speaker Style Transfer by Quantized VAE and
  Speaker-wise Normalization in Speech Synthesis
Style-Label-Free: Cross-Speaker Style Transfer by Quantized VAE and Speaker-wise Normalization in Speech Synthesis
Chunyu Qiang
Peng Yang
Hao Che
Xiaorui Wang
Zhongyuan Wang
BDL
21
6
0
13 Dec 2022
Controllable speech synthesis by learning discrete phoneme-level
  prosodic representations
Controllable speech synthesis by learning discrete phoneme-level prosodic representations
Nikolaos Ellinas
Myrsini Christidou
Alexandra Vioni
June Sig Sung
Aimilios Chalamandaris
Pirros Tsiakoulis
P. Mastorocostas
17
7
0
29 Nov 2022
Speech Synthesis with Mixed Emotions
Speech Synthesis with Mixed Emotions
Kun Zhou
Berrak Sisman
R. Rana
B.W.Schuller
Haizhou Li
14
43
0
11 Aug 2022
Multi-Horizon Representations with Hierarchical Forward Models for
  Reinforcement Learning
Multi-Horizon Representations with Hierarchical Forward Models for Reinforcement Learning
Trevor A. McInroe
Lukas Schafer
Stefano V. Albrecht
20
4
0
22 Jun 2022
MuSE-SVS: Multi-Singer Emotional Singing Voice Synthesizer that Controls
  Emotional Intensity
MuSE-SVS: Multi-Singer Emotional Singing Voice Synthesizer that Controls Emotional Intensity
Sungjae Kim
Y.E. Kim
Jewoo Jun
Injung Kim
29
13
0
02 Mar 2022
MsEmoTTS: Multi-scale emotion transfer, prediction, and control for
  emotional speech synthesis
MsEmoTTS: Multi-scale emotion transfer, prediction, and control for emotional speech synthesis
Yinjiao Lei
Shan Yang
Xinsheng Wang
Lei Xie
17
72
0
17 Jan 2022
Emotion Intensity and its Control for Emotional Voice Conversion
Emotion Intensity and its Control for Emotional Voice Conversion
Kun Zhou
Berrak Sisman
R. Rana
Björn W. Schuller
Haizhou Li
52
54
0
10 Jan 2022
Prosodic Clustering for Phoneme-level Prosody Control in End-to-End
  Speech Synthesis
Prosodic Clustering for Phoneme-level Prosody Control in End-to-End Speech Synthesis
Alexandra Vioni
Myrsini Christidou
Nikolaos Ellinas
G. Vamvoukakis
Panos Kakoulidis
Taehoon Kim
June Sig Sung
Hyoungmin Park
Aimilios Chalamandaris
Pirros Tsiakoulis
8
11
0
19 Nov 2021
Word-Level Style Control for Expressive, Non-attentive Speech Synthesis
Word-Level Style Control for Expressive, Non-attentive Speech Synthesis
Konstantinos Klapsas
Nikolaos Ellinas
June Sig Sung
Hyoungmin Park
S. Raptis
22
9
0
19 Nov 2021
Improved Prosodic Clustering for Multispeaker and Speaker-independent
  Phoneme-level Prosody Control
Improved Prosodic Clustering for Multispeaker and Speaker-independent Phoneme-level Prosody Control
Myrsini Christidou
Alexandra Vioni
Nikolaos Ellinas
G. Vamvoukakis
K. Markopoulos
Panos Kakoulidis
June Sig Sung
Hyoungmin Park
Aimilios Chalamandaris
Pirros Tsiakoulis
14
4
0
19 Nov 2021
Hierarchical prosody modeling and control in non-autoregressive parallel
  neural TTS
Hierarchical prosody modeling and control in non-autoregressive parallel neural TTS
T. Raitio
Jiangchuan Li
Shreyas Seshadri
32
22
0
06 Oct 2021
Applying the Information Bottleneck Principle to Prosodic Representation
  Learning
Applying the Information Bottleneck Principle to Prosodic Representation Learning
Guangyan Zhang
Ying Qin
Daxin Tan
Tan Lee
22
4
0
05 Aug 2021
Learning De-identified Representations of Prosody from Raw Audio
Learning De-identified Representations of Prosody from Raw Audio
J. Weston
R. Lenain
U. Meepegama
E. Fristed
SSL
24
15
0
17 Jul 2021
Multi-Scale Spectrogram Modelling for Neural Text-to-Speech
Multi-Scale Spectrogram Modelling for Neural Text-to-Speech
Ammar Abbas
Bajibabu Bollepalli
Alexis Moinet
Arnaud Joly
Penny Karanasou
Peter Makarov
Simon Slangens
S. Karlapati
Thomas Drugman
16
0
0
29 Jun 2021
A Survey on Neural Speech Synthesis
A Survey on Neural Speech Synthesis
Xu Tan
Tao Qin
Frank Soong
Tie-Yan Liu
AI4TS
18
352
0
29 Jun 2021
UniTTS: Residual Learning of Unified Embedding Space for Speech Style
  Control
UniTTS: Residual Learning of Unified Embedding Space for Speech Style Control
M. Kang
Sungjae Kim
Injung Kim
23
3
0
21 Jun 2021
Global Rhythm Style Transfer Without Text Transcriptions
Global Rhythm Style Transfer Without Text Transcriptions
Kaizhi Qian
Yang Zhang
Shiyu Chang
Jinjun Xiong
Chuang Gan
David D. Cox
M. Hasegawa-Johnson
26
20
0
16 Jun 2021
Ctrl-P: Temporal Control of Prosodic Variation for Speech Synthesis
Ctrl-P: Temporal Control of Prosodic Variation for Speech Synthesis
D. Mohan
Qinmin Hu
Tian Huey Teh
Alexandra Torresquintero
C. Wallis
Marlene Staib
Lorenzo Foglianti
Jiameng Gao
Simon King
20
16
0
15 Jun 2021
Review of end-to-end speech synthesis technology based on deep learning
Review of end-to-end speech synthesis technology based on deep learning
Zhaoxi Mu
Xinyu Yang
Yizhuo Dong
AuLLM
ALM
18
24
0
20 Apr 2021
Clockwork Variational Autoencoders
Clockwork Variational Autoencoders
Vaibhav Saxena
Jimmy Ba
Danijar Hafner
VGen
DRL
21
49
0
18 Feb 2021
Hierarchical Prosody Modeling for Non-Autoregressive Speech Synthesis
Hierarchical Prosody Modeling for Non-Autoregressive Speech Synthesis
C. Chien
Hung-yi Lee
19
36
0
12 Nov 2020
Fine-grained Style Modeling, Transfer and Prediction in Text-to-Speech
  Synthesis via Phone-Level Content-Style Disentanglement
Fine-grained Style Modeling, Transfer and Prediction in Text-to-Speech Synthesis via Phone-Level Content-Style Disentanglement
Daxin Tan
Tan Lee
19
21
0
08 Nov 2020
Non-Attentive Tacotron: Robust and Controllable Neural TTS Synthesis
  Including Unsupervised Duration Modeling
Non-Attentive Tacotron: Robust and Controllable Neural TTS Synthesis Including Unsupervised Duration Modeling
Jonathan Shen
Ye Jia
Mike Chrzanowski
Yu Zhang
Isaac Elias
Heiga Zen
Yonghui Wu
14
112
0
08 Oct 2020
Controllable neural text-to-speech synthesis using intuitive prosodic
  features
Controllable neural text-to-speech synthesis using intuitive prosodic features
T. Raitio
Ramya Rasipuram
D. Castellani
24
66
0
14 Sep 2020
RECOApy: Data recording, pre-processing and phonetic transcription for
  end-to-end speech-based applications
RECOApy: Data recording, pre-processing and phonetic transcription for end-to-end speech-based applications
Adriana Stan
21
5
0
11 Sep 2020
Expressive TTS Training with Frame and Style Reconstruction Loss
Expressive TTS Training with Frame and Style Reconstruction Loss
Rui Liu
Berrak Sisman
Guanglai Gao
Haizhou Li
22
73
0
04 Aug 2020
Pitchtron: Towards audiobook generation from ordinary people's voices
Pitchtron: Towards audiobook generation from ordinary people's voices
Sunghee Jung
Hoi-Rim Kim
6
5
0
21 May 2020
Semi-supervised Learning for Multi-speaker Text-to-speech Synthesis
  Using Discrete Speech Representation
Semi-supervised Learning for Multi-speaker Text-to-speech Synthesis Using Discrete Speech Representation
Tao Tu
Yuan-Jui Chen
Alexander H. Liu
Hung-yi Lee
17
7
0
16 May 2020
Unsupervised Speech Decomposition via Triple Information Bottleneck
Unsupervised Speech Decomposition via Triple Information Bottleneck
Kaizhi Qian
Yang Zhang
Shiyu Chang
David D. Cox
M. Hasegawa-Johnson
4
177
0
23 Apr 2020
Perception of prosodic variation for speech synthesis using an
  unsupervised discrete representation of F0
Perception of prosodic variation for speech synthesis using an unsupervised discrete representation of F0
Zack Hodari
Catherine Lai
Simon King
6
13
0
14 Mar 2020
Prosody Transfer in Neural Text to Speech Using Global Pitch and
  Loudness Features
Prosody Transfer in Neural Text to Speech Using Global Pitch and Loudness Features
Siddharth Gururani
Kilol Gupta
D. Shah
Z. Shakeri
Jervis Pinto
4
15
0
21 Nov 2019
Improving Universal Sound Separation Using Sound Classification
Improving Universal Sound Separation Using Sound Classification
Efthymios Tzinis
Scott Wisdom
J. Hershey
A. Jansen
D. Ellis
VLM
21
73
0
18 Nov 2019
Semi-Supervised Generative Modeling for Controllable Speech Synthesis
Semi-Supervised Generative Modeling for Controllable Speech Synthesis
Raza Habib
Soroosh Mariooryad
Matt Shannon
Eric Battenberg
RJ Skerry-Ryan
Daisy Stanton
David Kao
Tom Bagby
BDL
9
48
0
03 Oct 2019
Evaluating Long-form Text-to-Speech: Comparing the Ratings of Sentences
  and Paragraphs
Evaluating Long-form Text-to-Speech: Comparing the Ratings of Sentences and Paragraphs
R. Clark
Hanna Silén
Tom Kenter
Ralph Leith
ELM
10
44
0
09 Sep 2019
Using generative modelling to produce varied intonation for speech
  synthesis
Using generative modelling to produce varied intonation for speech synthesis
Zack Hodari
O. Watts
Simon King
21
29
0
10 Jun 2019
1