2105.06337
Cited By
Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech
13 May 2021
Vadim Popov
Ivan Vovk
Vladimir Gogoryan
Tasnima Sadekova
Mikhail Kudinov
DiffM
Papers citing "Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech"
50 / 115 papers shown
Title
ACT-Diffusion: Efficient Adversarial Consistency Training for One-step Diffusion Models
Fei Kong
Jinhao Duan
Lichao Sun
Hao-Ran Cheng
Renjing Xu
Hengtao Shen
Xiao-lan Zhu
Xiaoshuang Shi
Kaidi Xu
38
3
0
23 Nov 2023
Diff-HierVC: Diffusion-based Hierarchical Voice Conversion with Robust Pitch Generation and Masked Prior for Zero-shot Speaker Adaptation
Haram Choi
Sang-Hoon Lee
Seong-Whan Lee
DiffM
21
24
0
08 Nov 2023
Seeing Through the Conversation: Audio-Visual Speech Separation based on Diffusion Model
Suyeon Lee
Chaeyoung Jung
Youngjoon Jang
Jaehun Kim
Joon Son Chung
30
7
0
30 Oct 2023
Cross-Utterance Conditioned VAE for Speech Generation
Y. Li
Cheng Yu
Guangzhi Sun
Weiqin Zu
Zheng Tian
...
Wei Pan
Chao Zhang
Jun Wang
Yang Yang
Fanglei Sun
11
2
0
08 Sep 2023
Highly Controllable Diffusion-based Any-to-Any Voice Conversion Model with Frame-level Prosody Feature
Kyungguen Byun
Sunkuk Moon
Erik Visser
DiffM
24
0
0
06 Sep 2023
Matcha-TTS: A fast TTS architecture with conditional flow matching
Shivam Mehta
Ruibo Tu
Jonas Beskow
Éva Székely
G. Henter
16
69
0
06 Sep 2023
DiCLET-TTS: Diffusion Model based Cross-lingual Emotion Transfer for Text-to-Speech -- A Study between English and Mandarin
Tao Li
Chenxu Hu
Jian Cong
Xinfa Zhu
Jingbei Li
Qiao Tian
Yuping Wang
Linfu Xie
DiffM
27
8
0
02 Sep 2023
LightGrad: Lightweight Diffusion Probabilistic Model for Text-to-Speech
Jing Chen
Xingcheng Song
Zhendong Peng
Binbin Zhang
Fuping Pan
Zhiyong Wu
DiffM
19
16
0
31 Aug 2023
Multi-GradSpeech: Towards Diffusion-based Multi-Speaker Text-to-speech Using Consistent Diffusion Models
Heyang Xue
Shuai Guo
Pengcheng Zhu
Mengxiao Bi
DiffM
35
1
0
21 Aug 2023
AudioLDM 2: Learning Holistic Audio Generation with Self-supervised Pretraining
Haohe Liu
Yiitan Yuan
Xubo Liu
Xinhao Mei
Qiuqiang Kong
Qiao Tian
Yuping Wang
Wenwu Wang
Yuxuan Wang
Mark D. Plumbley
DiffM
25
222
0
10 Aug 2023
Nearly d-Linear Convergence Bounds for Diffusion Models via Stochastic Localization
Joe Benton
Valentin De Bortoli
Arnaud Doucet
George Deligiannidis
DiffM
41
101
0
07 Aug 2023
VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with Adversarial Learning and Architecture Design
Jungil Kong
Jihoon Park
Beomjeong Kim
Jeongmin Kim
Dohee Kong
Sangjin Kim
29
35
0
31 Jul 2023
Error Bounds for Flow Matching Methods
Joe Benton
George Deligiannidis
Arnaud Doucet
DiffM
28
31
0
26 May 2023
SEEDS: Exponential SDE Solvers for Fast High-Quality Sampling from Diffusion Models
Martin Gonzalez
N. Fernández
T. Tran
Elies Gherbi
H. Hajri
N. Masmoudi
DiffM
30
22
0
23 May 2023
U-DiT TTS: U-Diffusion Vision Transformer for Text-to-Speech
Xin Jing
Yi Chang
Zijiang Yang
Jiang-jian Xie
Andreas Triantafyllopoulos
Bjoern W. Schuller
31
10
0
22 May 2023
ViT-TTS: Visual Text-to-Speech with Scalable Diffusion Transformer
Huadai Liu
Rongjie Huang
Xuan Lin
Wenqiang Xu
Maozong Zheng
Hong Chen
Jinzheng He
Zhou Zhao
DiffM
31
20
0
22 May 2023
CoMoSpeech: One-Step Speech and Singing Voice Synthesis via Consistency Model
Zhe Ye
Wei Xue
Xuejiao Tan
Jie Chen
Qi-fei Liu
Yi-Ting Guo
DiffM
30
40
0
11 May 2023
Learn to Sing by Listening: Building Controllable Virtual Singer by Unsupervised Learning from Voice Recordings
Wei Xue
Yiwen Wang
Qi-fei Liu
Yi-Ting Guo
24
1
0
09 May 2023
Multi-Speaker Multi-Lingual VQTTS System for LIMMITS 2023 Challenge
Chenpeng Du
Yiwei Guo
Feiyu Shen
Kai Yu
19
5
0
25 Apr 2023
DiffTAD: Temporal Action Detection with Proposal Denoising Diffusion
Sauradip Nag
Xiatian Zhu
Jiankang Deng
Yi-Zhe Song
Tao Xiang
DiffM
VGen
41
21
0
27 Mar 2023
DiffusionRet: Generative Text-Video Retrieval with Diffusion Model
Peng Jin
Hao Li
Ze-Long Cheng
Kehan Li
Xiang Ji
Chang-rui Liu
Li-ming Yuan
Jie Chen
DiffM
VGen
26
53
0
17 Mar 2023
Unifying Layout Generation with a Decoupled Diffusion Model
Mude Hui
Zhizheng Zhang
Xiaoyi Zhang
Wenxuan Xie
Yuwang Wang
Yan Lu
DiffM
13
39
0
09 Mar 2023
Varianceflow: High-Quality and Controllable Text-to-Speech using Variance Information via Normalizing Flow
Yoonhyung Lee
Jinhyeok Yang
Kyomin Jung
20
6
0
27 Feb 2023
Star-Shaped Denoising Diffusion Probabilistic Models
Andrey Okhotin
Dmitry Molchanov
V. Arkhipkin
Grigory Bartosh
Viktor Ohanesian
Aibek Alanov
Dmitry Vetrov
DiffM
32
12
0
10 Feb 2023
HumanMAC: Masked Motion Completion for Human Motion Prediction
Ling-Hao Chen
Jiawei Zhang
Ye-rong Li
Yiren Pang
Xiaobo Xia
Tongliang Liu
DiffM
VGen
32
56
0
07 Feb 2023
InstructTTS: Modelling Expressive TTS in Discrete Latent Space with Natural Language Style Prompt
Dongchao Yang
Songxiang Liu
Rongjie Huang
Chao Weng
H. Meng
DiffM
VLM
31
85
0
31 Jan 2023
Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers
Chengyi Wang
Sanyuan Chen
Yu-Huan Wu
Zi-Hua Zhang
Long Zhou
...
Huaming Wang
Jinyu Li
Lei He
Sheng Zhao
Furu Wei
43
641
0
05 Jan 2023
ResGrad: Residual Denoising Diffusion Probabilistic Models for Text to Speech
Ze Chen
Yihan Wu
Yichong Leng
Jiawei Chen
Haohe Liu
...
Ke Wang
Lei He
Sheng Zhao
Jiang Bian
Danilo P. Mandic
DiffM
27
22
0
30 Dec 2022
MoFusion: A Framework for Denoising-Diffusion-based Motion Synthesis
Rishabh Dabral
Muhammad Hamza Mughal
Vladislav Golyanik
Christian Theobalt
DiffM
VGen
32
171
0
08 Dec 2022
Denoising diffusion probabilistic models for probabilistic energy forecasting
Esteban Hernandez Capel
Jonathan Dumas
DiffM
11
15
0
06 Dec 2022
Fast Sampling of Diffusion Models via Operator Learning
Hongkai Zheng
Weili Nie
Arash Vahdat
Kamyar Azizzadenesheli
Anima Anandkumar
DiffM
54
131
0
24 Nov 2022
DiffusionDet: Diffusion Model for Object Detection
Shoufa Chen
Pei Sun
Yibing Song
Ping Luo
54
442
0
17 Nov 2022
Towards Building Text-To-Speech Systems for the Next Billion Users
Gokul Karthik Kumar
Praveen S. V.
Pratyush Kumar
Mitesh M. Khapra
Karthik Nandakumar
36
18
0
17 Nov 2022
EmoDiff: Intensity Controllable Emotional Text-to-Speech with Soft-Label Guidance
Yiwei Guo
Chenpeng Du
Xie Chen
K. Yu
DiffM
52
39
0
17 Nov 2022
OverFlow: Putting flows on top of neural transducers for better TTS
Shivam Mehta
Ambika Kirkland
Harm Lameris
Jonas Beskow
Éva Székely
G. Henter
AI4TS
34
12
0
13 Nov 2022
Guided Conditional Diffusion for Controllable Traffic Simulation
Ziyuan Zhong
Davis Rempe
Danfei Xu
Yuxiao Chen
Sushant Veer
Tong Che
Baishakhi Ray
Marco Pavone
21
146
0
31 Oct 2022
Imagic: Text-Based Real Image Editing with Diffusion Models
Bahjat Kawar
Shiran Zada
Oran Lang
Omer Tov
Hui-Tang Chang
Tali Dekel
Inbar Mosseri
Michal Irani
11
1,050
0
17 Oct 2022
GENIE: Higher-Order Denoising Diffusion Solvers
Tim Dockhorn
Arash Vahdat
Karsten Kreis
DiffM
49
104
0
11 Oct 2022
f-DM: A Multi-stage Diffusion Model via Progressive Signal Transformation
Jiatao Gu
Shuangfei Zhai
Yizhe Zhang
Miguel Angel Bautista
J. Susskind
DiffM
53
26
0
10 Oct 2022
An Overview of Affective Speech Synthesis and Conversion in the Deep Learning Era
Andreas Triantafyllopoulos
Björn W. Schuller
Gökçe İymen
M. Sezgin
Xiangheng He
...
Shuo Liu
Silvan Mertes
Elisabeth André
Ruibo Fu
Jianhua Tao
15
53
0
06 Oct 2022
OCD: Learning to Overfit with Conditional Diffusion Models
Shahar Lutati
Lior Wolf
DiffM
20
8
0
02 Oct 2022
Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow
Xingchao Liu
Chengyue Gong
Qiang Liu
OOD
35
845
0
07 Sep 2022
Diffusion Models: A Comprehensive Survey of Methods and Applications
Ling Yang
Zhilong Zhang
Yingxia Shao
Shenda Hong
Runsheng Xu
Yue Zhao
Wentao Zhang
Bin Cui
Ming-Hsuan Yang
DiffM
MedIm
224
1,302
0
02 Sep 2022
A One-Shot Reparameterization Method for Reducing the Loss of Tile Pruning on DNNs
Yancheng Li
Qingzhong Ai
Fumihiko Ino
25
0
0
29 Jul 2022
ProDiff: Progressive Fast Diffusion Model For High-Quality Text-to-Speech
Rongjie Huang
Zhou Zhao
Huadai Liu
Jinglin Liu
Chenye Cui
Yi Ren
DiffM
44
193
0
13 Jul 2022
CopyCat2: A Single Model for Multi-Speaker TTS and Many-to-Many Fine-Grained Prosody Transfer
S. Karlapati
Penny Karanasou
Mateusz Lajszczak
Ammar Abbas
Alexis Moinet
Peter Makarov
Raymond Li
Arent van Korlaar
Simon Slangen
Thomas Drugman
14
15
0
27 Jun 2022
Multimodal Learning with Transformers: A Survey
P. Xu
Xiatian Zhu
David A. Clifton
ViT
52
525
0
13 Jun 2022
Zero-Shot Voice Conditioning for Denoising Diffusion TTS Models
Alon Levkovitch
Eliya Nachmani
Lior Wolf
DiffM
19
29
0
05 Jun 2022
Score-Based Generative Models Detect Manifolds
Jakiw Pidstrigach
DiffM
27
71
0
02 Jun 2022
Improved Vector Quantized Diffusion Models
Zhicong Tang
Shuyang Gu
Jianmin Bao
Dong Chen
Fang Wen
DiffM
178
63
0
31 May 2022