1905.07195
Cited By
CHiVE: Varying Prosody in Speech Synthesis with a Linguistically Driven Dynamic Hierarchical Conditional Variational Network
17 May 2019
V. Wan
Chun-an Chan
Tom Kenter
Jakub Vít
R. Clark
Papers citing
"CHiVE: Varying Prosody in Speech Synthesis with a Linguistically Driven Dynamic Hierarchical Conditional Variational Network"
43 / 43 papers shown
Title
PRESENT: Zero-Shot Text-to-Prosody Control
Perry Lam
Huayun Zhang
Nancy F. Chen
Berrak Sisman
Dorien Herremans
43
0
0
13 Aug 2024
A Human-in-the-Loop Approach to Improving Cross-Text Prosody Transfer
Himanshu Maurya
A. Sigurgeirsson
25
0
0
06 Jun 2024
DPP-TTS: Diversifying prosodic features of speech via determinantal point processes
Seongho Joo
Hyukhun Koh
Kyomin Jung
DiffM
39
0
0
23 Oct 2023
PMVC: Data Augmentation-Based Prosody Modeling for Expressive Voice Conversion
Yimin Deng
Huaizhen Tang
Xulong Zhang
Jianzong Wang
Ning Cheng
Jing Xiao
13
7
0
21 Aug 2023
Comparing normalizing flows and diffusion models for prosody and acoustic modelling in text-to-speech
Guangyan Zhang
Thomas Merritt
M. Ribeiro
Biel Tura Vecino
K. Yanagisawa
...
Ammar Abbas
Piotr Bilinski
Roberto Barra-Chicote
Daniel Korzekwa
Jaime Lorenzo-Trueba
DiffM
31
3
0
31 Jul 2023
The Ethical Implications of Generative Audio Models: A Systematic Literature Review
J. Barnett
16
25
0
07 Jul 2023
Improving Prosody for Cross-Speaker Style Transfer by Semi-Supervised Style Extractor and Hierarchical Modeling in Speech Synthesis
Chunyu Qiang
Peng Yang
Hao Che
Ying Zhang
Xiaorui Wang
Zhong-ming Wang
38
9
0
14 Mar 2023
Emotion Selectable End-to-End Text-based Speech Editing
Tao Wang
Jiangyan Yi
Ruibo Fu
J. Tao
Zhengqi Wen
Chu Yuan Zhang
30
2
0
20 Dec 2022
Style-Label-Free: Cross-Speaker Style Transfer by Quantized VAE and Speaker-wise Normalization in Speech Synthesis
Chunyu Qiang
Peng Yang
Hao Che
Xiaorui Wang
Zhongyuan Wang
BDL
21
6
0
13 Dec 2022
Controllable speech synthesis by learning discrete phoneme-level prosodic representations
Nikolaos Ellinas
Myrsini Christidou
Alexandra Vioni
June Sig Sung
Aimilios Chalamandaris
Pirros Tsiakoulis
P. Mastorocostas
17
7
0
29 Nov 2022
Speech Synthesis with Mixed Emotions
Kun Zhou
Berrak Sisman
R. Rana
B. W. Schuller
Haizhou Li
14
43
0
11 Aug 2022
Multi-Horizon Representations with Hierarchical Forward Models for Reinforcement Learning
Trevor A. McInroe
Lukas Schafer
Stefano V. Albrecht
20
4
0
22 Jun 2022
MuSE-SVS: Multi-Singer Emotional Singing Voice Synthesizer that Controls Emotional Intensity
Sungjae Kim
Y.E. Kim
Jewoo Jun
Injung Kim
29
13
0
02 Mar 2022
MsEmoTTS: Multi-scale emotion transfer, prediction, and control for emotional speech synthesis
Yinjiao Lei
Shan Yang
Xinsheng Wang
Lei Xie
17
72
0
17 Jan 2022
Emotion Intensity and its Control for Emotional Voice Conversion
Kun Zhou
Berrak Sisman
R. Rana
Björn W. Schuller
Haizhou Li
52
54
0
10 Jan 2022
Prosodic Clustering for Phoneme-level Prosody Control in End-to-End Speech Synthesis
Alexandra Vioni
Myrsini Christidou
Nikolaos Ellinas
G. Vamvoukakis
Panos Kakoulidis
Taehoon Kim
June Sig Sung
Hyoungmin Park
Aimilios Chalamandaris
Pirros Tsiakoulis
8
11
0
19 Nov 2021
Word-Level Style Control for Expressive, Non-attentive Speech Synthesis
Konstantinos Klapsas
Nikolaos Ellinas
June Sig Sung
Hyoungmin Park
S. Raptis
22
9
0
19 Nov 2021
Improved Prosodic Clustering for Multispeaker and Speaker-independent Phoneme-level Prosody Control
Myrsini Christidou
Alexandra Vioni
Nikolaos Ellinas
G. Vamvoukakis
K. Markopoulos
Panos Kakoulidis
June Sig Sung
Hyoungmin Park
Aimilios Chalamandaris
Pirros Tsiakoulis
14
4
0
19 Nov 2021
Hierarchical prosody modeling and control in non-autoregressive parallel neural TTS
T. Raitio
Jiangchuan Li
Shreyas Seshadri
32
22
0
06 Oct 2021
Applying the Information Bottleneck Principle to Prosodic Representation Learning
Guangyan Zhang
Ying Qin
Daxin Tan
Tan Lee
22
4
0
05 Aug 2021
Learning De-identified Representations of Prosody from Raw Audio
J. Weston
R. Lenain
U. Meepegama
E. Fristed
SSL
24
15
0
17 Jul 2021
Multi-Scale Spectrogram Modelling for Neural Text-to-Speech
Ammar Abbas
Bajibabu Bollepalli
Alexis Moinet
Arnaud Joly
Penny Karanasou
Peter Makarov
Simon Slangens
S. Karlapati
Thomas Drugman
16
0
0
29 Jun 2021
A Survey on Neural Speech Synthesis
Xu Tan
Tao Qin
Frank Soong
Tie-Yan Liu
AI4TS
18
352
0
29 Jun 2021
UniTTS: Residual Learning of Unified Embedding Space for Speech Style Control
M. Kang
Sungjae Kim
Injung Kim
23
3
0
21 Jun 2021
Global Rhythm Style Transfer Without Text Transcriptions
Kaizhi Qian
Yang Zhang
Shiyu Chang
Jinjun Xiong
Chuang Gan
David D. Cox
M. Hasegawa-Johnson
26
20
0
16 Jun 2021
Ctrl-P: Temporal Control of Prosodic Variation for Speech Synthesis
D. Mohan
Qinmin Hu
Tian Huey Teh
Alexandra Torresquintero
C. Wallis
Marlene Staib
Lorenzo Foglianti
Jiameng Gao
Simon King
20
16
0
15 Jun 2021
Review of end-to-end speech synthesis technology based on deep learning
Zhaoxi Mu
Xinyu Yang
Yizhuo Dong
AuLLM
ALM
18
24
0
20 Apr 2021
Clockwork Variational Autoencoders
Vaibhav Saxena
Jimmy Ba
Danijar Hafner
VGen
DRL
21
49
0
18 Feb 2021
Hierarchical Prosody Modeling for Non-Autoregressive Speech Synthesis
C. Chien
Hung-yi Lee
19
36
0
12 Nov 2020
Fine-grained Style Modeling, Transfer and Prediction in Text-to-Speech Synthesis via Phone-Level Content-Style Disentanglement
Daxin Tan
Tan Lee
19
21
0
08 Nov 2020
Non-Attentive Tacotron: Robust and Controllable Neural TTS Synthesis Including Unsupervised Duration Modeling
Jonathan Shen
Ye Jia
Mike Chrzanowski
Yu Zhang
Isaac Elias
Heiga Zen
Yonghui Wu
14
112
0
08 Oct 2020
Controllable neural text-to-speech synthesis using intuitive prosodic features
T. Raitio
Ramya Rasipuram
D. Castellani
24
66
0
14 Sep 2020
RECOApy: Data recording, pre-processing and phonetic transcription for end-to-end speech-based applications
Adriana Stan
21
5
0
11 Sep 2020
Expressive TTS Training with Frame and Style Reconstruction Loss
Rui Liu
Berrak Sisman
Guanglai Gao
Haizhou Li
22
73
0
04 Aug 2020
Pitchtron: Towards audiobook generation from ordinary people's voices
Sunghee Jung
Hoi-Rim Kim
6
5
0
21 May 2020
Semi-supervised Learning for Multi-speaker Text-to-speech Synthesis Using Discrete Speech Representation
Tao Tu
Yuan-Jui Chen
Alexander H. Liu
Hung-yi Lee
17
7
0
16 May 2020
Unsupervised Speech Decomposition via Triple Information Bottleneck
Kaizhi Qian
Yang Zhang
Shiyu Chang
David D. Cox
M. Hasegawa-Johnson
4
177
0
23 Apr 2020
Perception of prosodic variation for speech synthesis using an unsupervised discrete representation of F0
Zack Hodari
Catherine Lai
Simon King
6
13
0
14 Mar 2020
Prosody Transfer in Neural Text to Speech Using Global Pitch and Loudness Features
Siddharth Gururani
Kilol Gupta
D. Shah
Z. Shakeri
Jervis Pinto
4
15
0
21 Nov 2019
Improving Universal Sound Separation Using Sound Classification
Efthymios Tzinis
Scott Wisdom
J. Hershey
A. Jansen
D. Ellis
VLM
21
73
0
18 Nov 2019
Semi-Supervised Generative Modeling for Controllable Speech Synthesis
Raza Habib
Soroosh Mariooryad
Matt Shannon
Eric Battenberg
RJ Skerry-Ryan
Daisy Stanton
David Kao
Tom Bagby
BDL
9
48
0
03 Oct 2019
Evaluating Long-form Text-to-Speech: Comparing the Ratings of Sentences and Paragraphs
R. Clark
Hanna Silén
Tom Kenter
Ralph Leith
ELM
10
44
0
09 Sep 2019
Using generative modelling to produce varied intonation for speech synthesis
Zack Hodari
O. Watts
Simon King
21
29
0
10 Jun 2019