1810.07217
Hierarchical Generative Modeling for Controllable Speech Synthesis
16 October 2018
Wei-Ning Hsu
Yu Zhang
Ron J. Weiss
Heiga Zen
Yonghui Wu
Yuxuan Wang
Yuan Cao
Ye Jia
Zhiwen Chen
Jonathan Shen
Patrick Nguyen
Ruoming Pang
BDL
Papers citing "Hierarchical Generative Modeling for Controllable Speech Synthesis" (28 / 178 papers shown)
Deep Representation Learning in Speech Processing: Challenges, Recent Advances, and Future Trends
S. Latif
R. Rana
Sara Khalifa
Raja Jurdak
Junaid Qadir
Björn W. Schuller
AI4TS
96
82
0
02 Jan 2020
Singing Voice Conversion with Disentangled Representations of Singer and Vocal Technique Using Variational Autoencoders
Yin-Jyun Luo
Chin-Chen Hsu
Kat R. Agres
Dorien Herremans
DRL
99
47
0
03 Dec 2019
Dynamic Prosody Generation for Speech Synthesis using Linguistics-Driven Acoustic Embedding Selection
Shubhi Tyagi
M. Nicolis
Jonas Rohnke
Thomas Drugman
Jaime Lorenzo-Trueba
77
32
0
02 Dec 2019
Using VAEs and Normalizing Flows for One-shot Text-To-Speech Synthesis of Expressive Speech
Vatsal Aggarwal
Marius Cotescu
N. Prateek
Jaime Lorenzo-Trueba
Roberto Barra-Chicote
93
31
0
28 Nov 2019
Prosody Transfer in Neural Text to Speech Using Global Pitch and Loudness Features
Siddharth Gururani
Kilol Gupta
D. Shah
Z. Shakeri
Jervis Pinto
68
15
0
21 Nov 2019
ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Text-to-Speech Toolkit
Tomoki Hayashi
Ryuichi Yamamoto
Katsuki Inoue
Takenori Yoshimura
Shinji Watanabe
Tomoki Toda
K. Takeda
Yu Zhang
Xu Tan
VLM
93
205
0
24 Oct 2019
Semi-Supervised Generative Modeling for Controllable Speech Synthesis
Raza Habib
Soroosh Mariooryad
Matt Shannon
Eric Battenberg
RJ Skerry-Ryan
Daisy Stanton
David Kao
Tom Bagby
BDL
68
48
0
03 Oct 2019
Speech Recognition with Augmented Synthesized Speech
Andrew Rosenberg
Yu Zhang
Bhuvana Ramabhadran
Ye Jia
Pedro J. Moreno
Yonghui Wu
Zelin Wu
69
128
0
25 Sep 2019
Sequence to Sequence Neural Speech Synthesis with Prosody Modification Capabilities
Slava Shechtman
A. Sorin
56
33
0
23 Sep 2019
DurIAN: Duration Informed Attention Network For Multimodal Synthesis
Chengzhu Yu
Heng Lu
Na Hu
Meng Yu
Chao Weng
...
Deyi Tuo
Shiyin Kang
Guangzhi Lei
Jane Polak Scowcroft
Dong Yu
CVBM
89
118
0
04 Sep 2019
Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning
Yu Zhang
Ron J. Weiss
Heiga Zen
Yonghui Wu
Zhiwen Chen
RJ Skerry-Ryan
Ye Jia
Andrew Rosenberg
Bhuvana Ramabhadran
76
189
0
09 Jul 2019
A Methodology for Controlling the Emotional Expressiveness in Synthetic Speech -- a Deep Learning approach
Noé Tits
40
10
0
05 Jul 2019
Improving Performance of End-to-End ASR on Numeric Sequences
Cal Peyser
Hao Zhang
Tara N. Sainath
Zelin Wu
AI4TS
63
36
0
01 Jul 2019
Learning Disentangled Representations of Timbre and Pitch for Musical Instrument Sounds Using Gaussian Mixture Variational Autoencoders
Yin-Jyun Luo
Kat R. Agres
Dorien Herremans
103
46
0
19 Jun 2019
Using generative modelling to produce varied intonation for speech synthesis
Zack Hodari
O. Watts
Simon King
67
29
0
10 Jun 2019
Effective Use of Variational Embedding Capacity in Expressive End-to-End Speech Synthesis
Eric Battenberg
Soroosh Mariooryad
Daisy Stanton
RJ Skerry-Ryan
Matt Shannon
David Kao
Tom Bagby
BDL
107
45
0
08 Jun 2019
MelNet: A Generative Model for Audio in the Frequency Domain
Sean Vasquez
M. Lewis
DiffM
85
132
0
04 Jun 2019
Non-Autoregressive Neural Text-to-Speech
Kainan Peng
Ming-Yu Liu
Z. Song
Kexin Zhao
101
40
0
21 May 2019
CHiVE: Varying Prosody in Speech Synthesis with a Linguistically Driven Dynamic Hierarchical Conditional Variational Network
V. Wan
Chun-an Chan
Tom Kenter
Jakub Vít
R. Clark
71
75
0
17 May 2019
Direct speech-to-speech translation with a sequence-to-sequence model
Ye Jia
Ron J. Weiss
Fadi Biadsy
Wolfgang Macherey
Melvin Johnson
Zhiwen Chen
Yonghui Wu
101
230
0
12 Apr 2019
LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech
Heiga Zen
Viet Dang
R. Clark
Yu Zhang
Ron J. Weiss
Ye Jia
Zhiwen Chen
Yonghui Wu
164
959
0
05 Apr 2019
In Other News: A Bi-style Text-to-speech Model for Synthesizing Newscaster Voice with Limited Data
N. Prateek
Mateusz Lajszczak
Roberto Barra-Chicote
Thomas Drugman
Jaime Lorenzo-Trueba
Thomas Merritt
S. Ronanki
Trevor Wood
87
30
0
04 Apr 2019
Multi-reference Tacotron by Intercross Training for Style Disentangling, Transfer and Control in Speech Synthesis
Yanyao Bian
Changbin Chen
Yongguo Kang
Zhenglin Pan
77
46
0
04 Apr 2019
Visualization and Interpretation of Latent Spaces for Controlling Expressive Speech Synthesis through Audio Analysis
Noé Tits
Fengna Wang
Kevin El Haddad
Vincent Pagel
Thierry Dutoit
DiffM
88
39
0
27 Mar 2019
Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling
Jonathan Shen
Patrick Nguyen
Yonghui Wu
Zhiwen Chen
Mengzhao Chen
...
William Chan
Shubham Toshniwal
Baohua Liao
M. Nirschl
Pat Rondon
VLM
113
211
0
21 Feb 2019
Unsupervised speech representation learning using WaveNet autoencoders
J. Chorowski
Ron J. Weiss
Samy Bengio
Aaron van den Oord
SSL
76
319
0
25 Jan 2019
Leveraging Weakly Supervised Data to Improve End-to-End Speech-to-Text Translation
Ye Jia
Melvin Johnson
Wolfgang Macherey
Ron J. Weiss
Yuan Cao
Chung-Cheng Chiu
Naveen Ari
Stella Laurenzo
Yonghui Wu
98
163
0
05 Nov 2018
A Variational Prosody Model for Mapping the Context-Sensitive Variation of Functional Prosodic Prototypes
B. Gerazov
Gérard Bailly
Omar Mohammed
Yi Xu
Philip N. Garner
63
7
0
22 Jun 2018