Autoregressive Diffusion Transformer for Text-to-Speech Synthesis

Autoregressive Diffusion Transformer for Text-to-Speech Synthesis

8 June 2024

Zhijun Liu

Haizhou Li

Papers citing "Autoregressive Diffusion Transformer for Text-to-Speech Synthesis"

14 / 14 papers shown

Title
E1 TTS: Simple and Fast Non-Autoregressive TTS Zhijun Liu Shuai Wang Pengcheng Zhu Mengxiao Bi Haizhou Li VLM DiffM 38 3 0 14 Sep 2024
Score identity Distillation: Exponentially Fast Distillation of Pretrained Diffusion Models for One-Step Generation Mingyuan Zhou Huangjie Zheng Zhendong Wang Mingzhang Yin Hai Huang DiffM 47 50 0 05 Apr 2024
CLaM-TTS: Improving Neural Codec Language Model for Zero-Shot Text-to-Speech Jaehyeon Kim Keon Lee Seungjun Chung Jaewoong Cho 65 39 0 03 Apr 2024
VoiceCraft: Zero-Shot Speech Editing and Text-to-Speech in the Wild Puyuan Peng Po-Yao (Bernie) Huang Daniel Li Abdelrahman Mohamed David F. Harwath 60 57 0 25 Mar 2024
One-step Diffusion with Distribution Matching Distillation Tianwei Yin Michael Gharbi Richard Zhang Eli Shechtman Frédo Durand William T. Freeman Taesung Park DiffM 124 219 0 30 Nov 2023
FluentSpeech: Stutter-Oriented Automatic Speech Editing with Context-Aware Diffusion Models Ziyue Jiang Qiang Yang Jia-li Zuo Zhe Ye Rongjie Huang Yixiang Ren Zhou Zhao DiffM 62 13 0 23 May 2023
HiFi-Codec: Group-residual Vector quantization for High Fidelity Audio Codec Dongchao Yang Songxiang Liu Rongjie Huang Jinchuan Tian Chao Weng Yuexian Zou 138 118 0 04 May 2023
Stochastic Interpolants: A Unifying Framework for Flows and Diffusions M. S. Albergo Nicholas M. Boffi Eric Vanden-Eijnden DiffM 244 261 0 15 Mar 2023
DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs Songxiang Liu Dan Su Dong Yu DiffM 68 65 0 28 Jan 2022
EdiTTS: Score-based Editing for Controllable Text-to-Speech Jaesung Tae Hyeongju Kim Taesu Kim DiffM 171 39 0 06 Oct 2021
Generative Spoken Language Modeling from Raw Audio Kushal Lakhotia Evgeny Kharitonov Wei-Ning Hsu Yossi Adi Adam Polyak ... Tu Nguyen Jade Copet Alexei Baevski A. Mohamed Emmanuel Dupoux AuLLM 174 336 0 01 Feb 2021
Knowledge Distillation in Iterative Generative Models for Improved Sampling Speed Eric Luhman Troy Luhman DiffM 184 259 0 07 Jan 2021
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis Ye Jia Yu Zhang Ron J. Weiss Quan Wang Jonathan Shen ... Z. Chen Patrick Nguyen Ruoming Pang Ignacio López Moreno Yonghui Wu 201 819 0 12 Jun 2018
Categorical Reparameterization with Gumbel-Softmax Eric Jang S. Gu Ben Poole BDL 75 5,274 0 03 Nov 2016