VARA-TTS: Non-Autoregressive Text-to-Speech Synthesis based on Very Deep VAE with Residual Attention

12 February 2021

Papers citing "VARA-TTS: Non-Autoregressive Text-to-Speech Synthesis based on Very Deep VAE with Residual Attention"

Title | Authors | Tags | Counts | Date
E1 TTS: Simple and Fast Non-Autoregressive TTS | Zhijun Liu, Shuai Wang, Pengcheng Zhu, Mengxiao Bi, Haizhou Li | VLM, DiffM | 38 3 0 | 14 Sep 2024
MeLFusion: Synthesizing Music from Image and Language Cues using Diffusion Models | Sanjoy Chowdhury, Sayan Nag, K. J. Joseph, Balaji Vasan Srinivasan, Dinesh Manocha | DiffM | 41 7 0 | 07 Jun 2024
Style-Label-Free: Cross-Speaker Style Transfer by Quantized VAE and Speaker-wise Normalization in Speech Synthesis | Chunyu Qiang, Peng Yang, Hao Che, Xiaorui Wang, Zhongyuan Wang | BDL | 16 6 0 | 13 Dec 2022
UniSyn: An End-to-End Unified Model for Text-to-Speech and Singing Voice Synthesis | Yinjiao Lei, Shan Yang, Xinsheng Wang, Qicong Xie, Jixun Yao, Linfu Xie, Dan Su | DiffM | 13 8 0 | 03 Dec 2022
Back-Translation-Style Data Augmentation for Mandarin Chinese Polyphone Disambiguation | Chunyu Qiang, Peng Yang, Hao Che, Jinba Xiao, Xiaorui Wang, Zhongyuan Wang | - | 9 3 0 | 17 Nov 2022
Towards zero-shot Text-based voice editing using acoustic context conditioning, utterance embeddings, and reference encoders | Jason Fong, Yun Wang, Prabhav Agrawal, Vimal Manohar, Jilong Wu, Thilo Kohler, Qing He | - | 11 0 0 | 28 Oct 2022
Features Fusion Framework for Multimodal Irregular Time-series Events | Peiwang Tang, Xianchao Zhang | AI4TS | 18 2 0 | 05 Sep 2022
A Survey on Non-Autoregressive Generation for Neural Machine Translation and Beyond | Yisheng Xiao, Lijun Wu, Junliang Guo, Juntao Li, M. Zhang, Tao Qin, Tie-Yan Liu | 3DV, MedIm, AI4CE | 25 82 0 | 20 Apr 2022
DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs | Songxiang Liu, Dan Su, Dong Yu | DiffM | 68 65 0 | 28 Jan 2022
Conditional Deep Hierarchical Variational Autoencoder for Voice Conversion | K. Akuzawa, Kotaro Onishi, Keisuke Takiguchi, Kohki Mametani, K. Mori | BDL, DRL | 19 6 0 | 06 Dec 2021
Guided-TTS: A Diffusion Model for Text-to-Speech via Classifier Guidance | Heeseung Kim, Sungwon Kim, Sungroh Yoon | DiffM, BDL | 13 107 0 | 23 Nov 2021
Improving Emotional Speech Synthesis by Using SUS-Constrained VAE and Text Encoder Aggregation | Fengyu Yang, Jian Luan, Yujun Wang | - | 25 5 0 | 19 Oct 2021
A Survey on Neural Speech Synthesis | Xu Tan, Tao Qin, Frank Soong, Tie-Yan Liu | AI4TS | 18 351 0 | 29 Jun 2021
UniTTS: Residual Learning of Unified Embedding Space for Speech Style Control | M. Kang, Sungjae Kim, Injung Kim | - | 21 3 0 | 21 Jun 2021
ItôTTS and ItôWave: Linear Stochastic Differential Equation Is All You Need For Audio Generation | Shoule Wu, Ziqiang Shi | DiffM | 6 11 0 | 17 May 2021