Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis

23 March 2018

Yuxuan Wang

Rif A. Saurous

Papers citing "Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis"

25 / 275 papers shown

Title | Authors | Tags | Counts | Date
End-to-end Text-to-speech for Low-resource Languages by Cross-Lingual Transfer Learning | Tao Tu, Yuan-Jui Chen, Cheng-chieh Yeh, Hung-yi Lee | - | 93 88 0 | 13 Apr 2019
Direct speech-to-speech translation with a sequence-to-sequence model | Ye Jia, Ron J. Weiss, Fadi Biadsy, Wolfgang Macherey, Melvin Johnson, Zhiwen Chen, Yonghui Wu | - | 101 230 0 | 12 Apr 2019
Exploiting Syntactic Features in a Parsed Tree to Improve End-to-End TTS | Haohan Guo, Frank Soong, Lei He, Lei Xie | - | 71 30 0 | 09 Apr 2019
LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech | Heiga Zen, Viet Dang, R. Clark, Yu Zhang, Ron J. Weiss, Ye Jia, Zhiwen Chen, Yonghui Wu | - | 164 959 0 | 05 Apr 2019
In Other News: A Bi-style Text-to-speech Model for Synthesizing Newscaster Voice with Limited Data | N. Prateek, Mateusz Lajszczak, Roberto Barra-Chicote, Thomas Drugman, Jaime Lorenzo-Trueba, Thomas Merritt, S. Ronanki, Trevor Wood | - | 87 30 0 | 04 Apr 2019
Multi-reference Tacotron by Intercross Training for Style Disentangling, Transfer and Control in Speech Synthesis | Yanyao Bian, Changbin Chen, Yongguo Kang, Zhenglin Pan | - | 77 46 0 | 04 Apr 2019
Visualization and Interpretation of Latent Spaces for Controlling Expressive Speech Synthesis through Audio Analysis | Noé Tits, Fengna Wang, Kevin El Haddad, Vincent Pagel, Thierry Dutoit | DiffM | 88 39 0 | 27 Mar 2019
Modeling Multi-speaker Latent Space to Improve Neural TTS: Quick Enrolling New Speaker and Enhancing Premium Voice | Yan Deng, Lei He, Frank Soong | - | 114 29 0 | 13 Dec 2018
Learning latent representations for style control and transfer in end-to-end speech synthesis | Ya-Jie Zhang, Shifeng Pan, Lei He, Zhenhua Ling | BDL, SSL, DRL | 97 229 0 | 11 Dec 2018
Robust and fine-grained prosody control of end-to-end speech synthesis | Younggun Lee, Jonathan Le Roux | - | 88 147 0 | 06 Nov 2018
Leveraging Weakly Supervised Data to Improve End-to-End Speech-to-Text Translation | Ye Jia, Melvin Johnson, Wolfgang Macherey, Ron J. Weiss, Yuan Cao, Chung-Cheng Chiu, Naveen Ari, Stella Laurenzo, Yonghui Wu | - | 98 163 0 | 05 Nov 2018
Investigating context features hidden in End-to-End TTS | Kohki Mametani, T. Kato, Seiichi Yamamoto | - | 46 9 0 | 04 Nov 2018
Training Neural Speech Recognition Systems with Synthetic Speech Augmentation | Jason Chun Lok Li, R. Gadde, Boris Ginsburg, Vitaly Lavrukhin | - | 63 55 0 | 02 Nov 2018
Hierarchical Generative Modeling for Controllable Speech Synthesis | Wei-Ning Hsu, Yu Zhang, Ron J. Weiss, Heiga Zen, Yonghui Wu, ..., Ye Jia, Zhiwen Chen, Jonathan Shen, Patrick Nguyen, Ruoming Pang | BDL | 93 276 0 | 16 Oct 2018
A Fully Time-domain Neural Model for Subband-based Speech Synthesizer | Azam Rabiee, Geonmin Kim, Tae-Ho Kim, Soo-Young Lee | - | 18 1 0 | 12 Oct 2018
Semi-Supervised Training for Improving Data Efficiency in End-to-End Speech Synthesis | Yu-An Chung, Yuxuan Wang, Wei-Ning Hsu, Yu Zhang, RJ Skerry-Ryan | - | 87 117 0 | 30 Aug 2018
Predicting Expressive Speaking Style From Text In End-To-End Speech Synthesis | Daisy Stanton, Yuxuan Wang, RJ Skerry-Ryan | - | 73 122 0 | 04 Aug 2018
Wasserstein GAN and Waveform Loss-based Acoustic Model Training for Multi-speaker Text-to-Speech Synthesis Systems Using a WaveNet Vocoder | Yi Zhao, Shinji Takaki, Hieu-Thi Luong, Junichi Yamagishi, Daisuke Saito, Nobuaki Minematsu | - | 92 64 0 | 31 Jul 2018
Scaling and bias codes for modeling speaker-adaptive DNN-based speech synthesis systems | Hieu-Thi Luong, Junichi Yamagishi | - | 117 7 0 | 31 Jul 2018
Deep Encoder-Decoder Models for Unsupervised Learning of Controllable Speech Synthesis | G. Henter, Jaime Lorenzo-Trueba, Xin Wang, Junichi Yamagishi | DRL, SSL | 88 61 0 | 30 Jul 2018
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis | Ye Jia, Yu Zhang, Ron J. Weiss, Quan Wang, Jonathan Shen, ..., Zhiwen Chen, Patrick Nguyen, Ruoming Pang, Ignacio López Moreno, Yonghui Wu | - | 270 838 0 | 12 Jun 2018
Disentangling by Partitioning: A Representation Learning Framework for Multimodal Sensory Data | Wei-Ning Hsu, James R. Glass | DRL | 79 43 0 | 29 May 2018
Automatic Documentation of ICD Codes with Far-Field Speech Recognition | Albert Haque, Corinna Fukushima | - | 23 0 0 | 30 Apr 2018
Conditional End-to-End Audio Transforms | Albert Haque, Michelle Guo, Prateek Verma | - | 114 41 0 | 30 Mar 2018
Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron | RJ Skerry-Ryan, Eric Battenberg, Y. Xiao, Yuxuan Wang, Daisy Stanton, Joel Shor, Ron J. Weiss, R. Clark, Rif A. Saurous | - | 56 555 0 | 24 Mar 2018