Generating diverse and natural text-to-speech samples using a quantized fine-grained VAE and auto-regressive prosody prior

6 February 2020

Guangzhi Sun

Andrew Rosenberg

Bhuvana Ramabhadran

Papers citing "Generating diverse and natural text-to-speech samples using a quantized fine-grained VAE and auto-regressive prosody prior"

10 / 60 papers shown

Title | Authors | Tags | Counts | Date
End-to-End Text-to-Speech using Latent Duration based on VQ-VAE | Yusuke Yasuda, Xin Wang, Junichi Yamagishi | | 13 16 0 | 19 Oct 2020
Hierarchical Multi-Grained Generative Model for Expressive Speech Synthesis | Yukiya Hono, Kazuna Tsuboi, Kei Sawada, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, K. Tokuda | BDL | 11 24 0 | 17 Sep 2020
Prosody Learning Mechanism for Speech Synthesis System Without Text Length Limit | Zhen Zeng, Jianzong Wang, Ning Cheng, Jing Xiao | | 11 8 0 | 13 Aug 2020
Modeling Prosodic Phrasing with Multi-Task Learning in Tacotron-based TTS | Rui Liu, Berrak Sisman, F. Bao, Guanglai Gao, Haizhou Li | | 9 17 0 | 11 Aug 2020
Expressive TTS Training with Frame and Style Reconstruction Loss | Rui Liu, Berrak Sisman, Guanglai Gao, Haizhou Li | | 24 73 0 | 04 Aug 2020
Exploiting Deep Sentential Context for Expressive End-to-End Speech Synthesis | Fengyu Yang, Shan Yang, Qinghua Wu, Yujun Wang, Lei Xie | | 11 5 0 | 03 Aug 2020
Pitchtron: Towards audiobook generation from ordinary people's voices | Sunghee Jung, Hoi-Rim Kim | | 11 5 0 | 21 May 2020
Attentron: Few-Shot Text-to-Speech Utilizing Attention-Based Variable-Length Embedding | Seungwoo Choi, Seungju Han, Dongyoung Kim, S. Ha | | 24 65 0 | 18 May 2020
Semi-supervised Learning for Multi-speaker Text-to-speech Synthesis Using Discrete Speech Representation | Tao Tu, Yuan-Jui Chen, Alexander H. Liu, Hung-yi Lee | | 25 7 0 | 16 May 2020
You Do Not Need More Data: Improving End-To-End Speech Recognition by Text-To-Speech Data Augmentation | A. Laptev, Roman Korostik, A. Svischev, A. Andrusenko, Ivan Medennikov, S. Rybin | | 16 61 0 | 14 May 2020