arXiv:1803.09017
Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis
23 March 2018
Yuxuan Wang
Daisy Stanton
Yu Zhang
RJ Skerry-Ryan
Eric Battenberg
Joel Shor
Y. Xiao
Fei Ren
Ye Jia
Rif A. Saurous
Papers citing
"Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis"
25 / 275 papers shown
Title
End-to-end Text-to-speech for Low-resource Languages by Cross-Lingual Transfer Learning
Tao Tu
Yuan-Jui Chen
Cheng-chieh Yeh
Hung-yi Lee
93
88
0
13 Apr 2019
Direct speech-to-speech translation with a sequence-to-sequence model
Ye Jia
Ron J. Weiss
Fadi Biadsy
Wolfgang Macherey
Melvin Johnson
Zhiwen Chen
Yonghui Wu
101
230
0
12 Apr 2019
Exploiting Syntactic Features in a Parsed Tree to Improve End-to-End TTS
Haohan Guo
Frank Soong
Lei He
Lei Xie
71
30
0
09 Apr 2019
LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech
Heiga Zen
Viet Dang
R. Clark
Yu Zhang
Ron J. Weiss
Ye Jia
Zhiwen Chen
Yonghui Wu
164
959
0
05 Apr 2019
In Other News: A Bi-style Text-to-speech Model for Synthesizing Newscaster Voice with Limited Data
N. Prateek
Mateusz Lajszczak
Roberto Barra-Chicote
Thomas Drugman
Jaime Lorenzo-Trueba
Thomas Merritt
S. Ronanki
Trevor Wood
87
30
0
04 Apr 2019
Multi-reference Tacotron by Intercross Training for Style Disentangling, Transfer and Control in Speech Synthesis
Yanyao Bian
Changbin Chen
Yongguo Kang
Zhenglin Pan
77
46
0
04 Apr 2019
Visualization and Interpretation of Latent Spaces for Controlling Expressive Speech Synthesis through Audio Analysis
Noé Tits
Fengna Wang
Kevin El Haddad
Vincent Pagel
Thierry Dutoit
DiffM
88
39
0
27 Mar 2019
Modeling Multi-speaker Latent Space to Improve Neural TTS: Quick Enrolling New Speaker and Enhancing Premium Voice
Yan Deng
Lei He
Frank Soong
114
29
0
13 Dec 2018
Learning latent representations for style control and transfer in end-to-end speech synthesis
Ya-Jie Zhang
Shifeng Pan
Lei He
Zhenhua Ling
BDL
SSL
DRL
97
229
0
11 Dec 2018
Robust and fine-grained prosody control of end-to-end speech synthesis
Younggun Lee
Jonathan Le Roux
88
147
0
06 Nov 2018
Leveraging Weakly Supervised Data to Improve End-to-End Speech-to-Text Translation
Ye Jia
Melvin Johnson
Wolfgang Macherey
Ron J. Weiss
Yuan Cao
Chung-Cheng Chiu
Naveen Ari
Stella Laurenzo
Yonghui Wu
98
163
0
05 Nov 2018
Investigating context features hidden in End-to-End TTS
Kohki Mametani
T. Kato
Seiichi Yamamoto
46
9
0
04 Nov 2018
Training Neural Speech Recognition Systems with Synthetic Speech Augmentation
Jason Chun Lok Li
R. Gadde
Boris Ginsburg
Vitaly Lavrukhin
63
55
0
02 Nov 2018
Hierarchical Generative Modeling for Controllable Speech Synthesis
Wei-Ning Hsu
Yu Zhang
Ron J. Weiss
Heiga Zen
Yonghui Wu
...
Ye Jia
Zhiwen Chen
Jonathan Shen
Patrick Nguyen
Ruoming Pang
BDL
93
276
0
16 Oct 2018
A Fully Time-domain Neural Model for Subband-based Speech Synthesizer
Azam Rabiee
Geonmin Kim
Tae-Ho Kim
Soo-Young Lee
18
1
0
12 Oct 2018
Semi-Supervised Training for Improving Data Efficiency in End-to-End Speech Synthesis
Yu-An Chung
Yuxuan Wang
Wei-Ning Hsu
Yu Zhang
RJ Skerry-Ryan
87
117
0
30 Aug 2018
Predicting Expressive Speaking Style From Text In End-To-End Speech Synthesis
Daisy Stanton
Yuxuan Wang
RJ Skerry-Ryan
73
122
0
04 Aug 2018
Wasserstein GAN and Waveform Loss-based Acoustic Model Training for Multi-speaker Text-to-Speech Synthesis Systems Using a WaveNet Vocoder
Yi Zhao
Shinji Takaki
Hieu-Thi Luong
Junichi Yamagishi
Daisuke Saito
Nobuaki Minematsu
92
64
0
31 Jul 2018
Scaling and bias codes for modeling speaker-adaptive DNN-based speech synthesis systems
Hieu-Thi Luong
Junichi Yamagishi
117
7
0
31 Jul 2018
Deep Encoder-Decoder Models for Unsupervised Learning of Controllable Speech Synthesis
G. Henter
Jaime Lorenzo-Trueba
Xin Wang
Junichi Yamagishi
DRL
SSL
88
61
0
30 Jul 2018
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis
Ye Jia
Yu Zhang
Ron J. Weiss
Quan Wang
Jonathan Shen
...
Zhiwen Chen
Patrick Nguyen
Ruoming Pang
Ignacio López Moreno
Yonghui Wu
270
838
0
12 Jun 2018
Disentangling by Partitioning: A Representation Learning Framework for Multimodal Sensory Data
Wei-Ning Hsu
James R. Glass
DRL
79
43
0
29 May 2018
Automatic Documentation of ICD Codes with Far-Field Speech Recognition
Albert Haque
Corinna Fukushima
23
0
0
30 Apr 2018
Conditional End-to-End Audio Transforms
Albert Haque
Michelle Guo
Prateek Verma
114
41
0
30 Mar 2018
Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron
RJ Skerry-Ryan
Eric Battenberg
Y. Xiao
Yuxuan Wang
Daisy Stanton
Joel Shor
Ron J. Weiss
R. Clark
Rif A. Saurous
56
555
0
24 Mar 2018