ResearchTrend.AI
Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech

13 May 2021
Vadim Popov
Ivan Vovk
Vladimir Gogoryan
Tasnima Sadekova
Mikhail Kudinov
    DiffM

Papers citing "Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech"

50 / 114 papers shown
Title
Language translation, and change of accent for speech-to-speech task using diffusion model
Abhishek Mishra
Ritesh Sur Chowdhury
Vartul Bahuguna
Isha Pandey
Ganesh Ramakrishnan
DiffM
44
0
0
04 May 2025
TriniMark: A Robust Generative Speech Watermarking Method for Trinity-Level Attribution
Yue Li
W. Liu
Dongdong Lin
42
0
0
29 Apr 2025
P2Mark: Plug-and-play Parameter-level Watermarking for Neural Speech Generation
Yong Ren
Jiangyan Yi
Tao Wang
J. Tao
Zhengqi Wen
Chenxing Li
Z. Lian
Ruibo Fu
Ye Bai
Xiaohui Zhang
58
0
0
07 Apr 2025
On the Generalization Properties of Diffusion Models
Puheng Li
Zhong Li
Huishuai Zhang
Jiang Bian
74
29
0
13 Mar 2025
AudioX: Diffusion Transformer for Anything-to-Audio Generation
Zeyue Tian
Yizhu Jin
Zhaoyang Liu
Ruibin Yuan
Xu Tan
Qifeng Chen
Wei Xue
Y. Guo
67
3
0
13 Mar 2025
An Exhaustive Evaluation of TTS- and VC-based Data Augmentation for ASR
Sewade Ogun
Vincent Colotte
Emmanuel Vincent
59
0
0
11 Mar 2025
A Dual-Purpose Framework for Backdoor Defense and Backdoor Amplification in Diffusion Models
Vu Tuan Truong
Long Bao Le
DiffM
AAML
198
0
0
26 Feb 2025
Everyday Speech in the Indian Subcontinent
Utkarsh Pathak
54
1
0
24 Feb 2025
RestoreGrad: Signal Restoration Using Conditional Denoising Diffusion Models with Jointly Learned Prior
Ching Hua Lee
Chouchang Yang
Jaejin Cho
Yashas Malur Saidutta
R. S. Srinivasa
Yilin Shen
Hongxia Jin
DiffM
85
0
0
19 Feb 2025
BackdoorDM: A Comprehensive Benchmark for Backdoor Learning in Diffusion Model
Weilin Lin
Nanjun Zhou
Y. Wang
Jianze Li
Hui Xiong
Li Liu
AAML
DiffM
172
0
0
17 Feb 2025
Less is More for Synthetic Speech Detection in the Wild
Ashi Garg
Zexin Cai
Henry Li Xinyuan
Leibny Paola García-Perera
Kevin Duh
Sanjeev Khudanpur
Matthew Wiesner
Nicholas Andrews
74
0
0
17 Feb 2025
Diffusion-Sharpening: Fine-tuning Diffusion Models with Denoising Trajectory Sharpening
Ye Tian
L. Yang
Xinchen Zhang
Yunhai Tong
Mengdi Wang
Bin Cui
67
1
0
17 Feb 2025
EmoReg: Directional Latent Vector Modeling for Emotional Intensity Regularization in Diffusion-based Voice Conversion
Ashishkumar Gudmalwar
Ishan D. Biyani
Nirmesh J. Shah
Pankaj Wasnik
R. Shah
DiffM
26
0
0
31 Dec 2024
UIBDiffusion: Universal Imperceptible Backdoor Attack for Diffusion Models
Yuning Han
Bingyin Zhao
Rui Chu
Feng Luo
Biplab Sikdar
Yingjie Lao
DiffM
AAML
75
1
0
16 Dec 2024
SoftVQ-VAE: Efficient 1-Dimensional Continuous Tokenizer
H. Chen
Z. Wang
X. Li
X. Sun
Fangyi Chen
Jiang Liu
J. Wang
Bhiksha Raj
Zicheng Liu
Emad Barsoum
VLM
114
6
0
14 Dec 2024
EmoSphere++: Emotion-Controllable Zero-Shot Text-to-Speech via Emotion-Adaptive Spherical Vector
Deok-Hyeon Cho
Hyung-Seok Oh
Seung-Bin Kim
Seong-Whan Lee
39
4
0
04 Nov 2024
TPC: Test-time Procrustes Calibration for Diffusion-based Human Image Animation
Sunjae Yoon
Gwanhyeong Koo
Younghwan Lee
Chang-Dong Yoo
VGen
74
3
0
31 Oct 2024
SF-Speech: Straightened Flow for Zero-Shot Voice Clone
Xuyuan Li
Zengqiang Shang
Hua Hua
Peiyang Shi
Chen Yang
Li Wang
Pengyuan Zhang
45
2
0
16 Oct 2024
Diffuse or Confuse: A Diffusion Deepfake Speech Dataset
Anton Firc
K. Malinka
P. Hanáček
DiffM
31
0
0
09 Oct 2024
SCOREQ: Speech Quality Assessment with Contrastive Regression
Alessandro Ragano
Jan Skoglund
Andrew Hines
40
6
0
09 Oct 2024
FlowMAC: Conditional Flow Matching for Audio Coding at Low Bit Rates
N. Pia
Martin Strauss
M. Multrus
B. Edler
37
0
0
26 Sep 2024
A Comprehensive Survey with Critical Analysis for Deepfake Speech Detection
Lam Pham
Phat Lam
Dat Tran
Hieu Tang
Tin Nguyen
Alexander Schindler
Canh Vu
Alexander Polonsky
51
3
0
23 Sep 2024
ViolinDiff: Enhancing Expressive Violin Synthesis with Pitch Bend Conditioning
Daewoong Kim
Hao-Wen Dong
Dasaem Jeong
23
0
0
19 Sep 2024
DPI-TTS: Directional Patch Interaction for Fast-Converging and Style Temporal Modeling in Text-to-Speech
Xin Qi
Ruibo Fu
Zhengqi Wen
Tao Wang
Chunyu Qiang
...
Xiaopeng Wang
Yuankun Xie
Yukun Liu
Xuefei Liu
Guanjun Li
DiffM
28
0
0
18 Sep 2024
SpoofCeleb: Speech Deepfake Detection and SASV In The Wild
Jee-weon Jung
Yihan Wu
Xin Wang
Ji-Hoon Kim
Soumi Maiti
...
Joon Son Chung
Wangyou Zhang
Seyun Um
Shinnosuke Takamichi
Shinji Watanabe
65
1
0
18 Sep 2024
Speaker Contrastive Learning for Source Speaker Tracing
Qing Wang
Hongmei Guo
Jian Kang
Mengjie Du
Jie Li
Xiao-Lei Zhang
Lei Xie
25
0
0
16 Sep 2024
Improving Robustness of Diffusion-Based Zero-Shot Speech Synthesis via Stable Formant Generation
C. Han
Seokgi Lee
Gyuhyeon Nam
Gyeongsu Chae
DiffM
121
0
0
14 Sep 2024
FLEURS-R: A Restored Multilingual Speech Corpus for Generation Tasks
Min Ma
Yuma Koizumi
Shigeki Karita
Heiga Zen
Jason Riesa
Haruko Ishikawa
M. Bacchiani
VLM
29
4
0
12 Aug 2024
ADD 2023: Towards Audio Deepfake Detection and Analysis in the Wild
Jiangyan Yi
Chu Yuan Zhang
Jianhua Tao
Chenglong Wang
Xinrui Yan
Yong Ren
Hao Gu
Junzuo Zhou
50
1
0
09 Aug 2024
Attacks and Defenses for Generative Diffusion Models: A Comprehensive Survey
V. T. Truong
Luan Ba Dang
Long Bao Le
DiffM
MedIm
50
16
0
06 Aug 2024
Bailing-TTS: Chinese Dialectal Speech Synthesis Towards Human-like Spontaneous Representation
Xinhan Di
Jiahao Lu
Yunming Liang
Junjie Zheng
Yihua Wang
Chaofan Ding
ALM
33
1
0
01 Aug 2024
GROOT: Generating Robust Watermark for Diffusion-Model-Based Audio Synthesis
Weizhi Liu
Yue Li
Dongdong Lin
Hui Tian
Haizhou Li
WIGM
34
8
0
15 Jul 2024
ScoreFusion: Fusing Score-based Generative Models via Kullback-Leibler Barycenters
Hao Liu
Junze Tony Ye
Jose H. Blanchet
DiffM
FedML
36
1
0
28 Jun 2024
Towards Zero-Shot Text-To-Speech for Arabic Dialects
Khai Duy Doan
Abdul Waheed
Muhammad Abdul-Mageed
38
0
0
24 Jun 2024
Towards Expressive Zero-Shot Speech Synthesis with Hierarchical Prosody Modeling
Yuepeng Jiang
Tao Li
Fengyu Yang
Lei Xie
Meng Meng
Yujun Wang
38
2
0
09 Jun 2024
Should you use a probabilistic duration model in TTS? Probably! Especially for spontaneous speech
Shivam Mehta
Harm Lameris
Rajiv Punmiya
Jonas Beskow
Éva Székely
G. Henter
23
1
0
08 Jun 2024
Fake it to make it: Using synthetic data to remedy the data shortage in joint multimodal speech-and-gesture synthesis
Shivam Mehta
Anna Deichler
Jim O'Regan
Birger Moëll
Jonas Beskow
G. Henter
Simon Alexanderson
46
4
0
30 Apr 2024
EM-TTS: Efficiently Trained Low-Resource Mongolian Lightweight Text-to-Speech
Ziqi Liang
Haoxiang Shi
Jiawei Wang
Keda Lu
35
0
0
13 Mar 2024
Trajectory Consistency Distillation: Improved Latent Consistency Distillation by Semi-Linear Consistency Function with Trajectory Mapping
Jianbin Zheng
Minghui Hu
Zhongyi Fan
Chaoyue Wang
Changxing Ding
Dacheng Tao
Tat-Jen Cham
43
26
0
29 Feb 2024
Contextualized Diffusion Models for Text-Guided Image and Video Generation
Ling Yang
Zhilong Zhang
Zhaochen Yu
Jingwei Liu
Minkai Xu
Stefano Ermon
Bin Cui
41
4
0
26 Feb 2024
Daisy-TTS: Simulating Wider Spectrum of Emotions via Prosody Embedding Decomposition
Rendi Chevi
Alham Fikri Aji
25
2
0
22 Feb 2024
Bringing Generative AI to Adaptive Learning in Education
Hang Li
Tianlong Xu
Chaoli Zhang
Eason Chen
Jing Liang
Xing Fan
Haoyang Li
Jiliang Tang
Qingsong Wen
45
20
0
02 Feb 2024
VALL-T: Decoder-Only Generative Transducer for Robust and Decoding-Controllable Text-to-Speech
Chenpeng Du
Yiwei Guo
Hankun Wang
Yifan Yang
Zhikang Niu
Shuai Wang
Hui Zhang
Xie Chen
Kai Yu
VLM
22
25
0
25 Jan 2024
SonicVisionLM: Playing Sound with Vision Language Models
Zhifeng Xie
Shengye Yu
Qile He
Mengtian Li
VLM
VGen
28
2
0
09 Jan 2024
Optimizing Diffusion Noise Can Serve As Universal Motion Priors
Korrawe Karunratanakul
Konpat Preechakul
Emre Aksan
Thabo Beeler
Supasorn Suwajanakorn
Siyu Tang
DiffM
31
37
0
19 Dec 2023
Towards Detailed Text-to-Motion Synthesis via Basic-to-Advanced Hierarchical Diffusion Model
Zhenyu Xie
Yang Wu
Xuehao Gao
Zhongqian Sun
Wei Yang
Xiaodan Liang
DiffM
29
11
0
18 Dec 2023
Investigating the Design Space of Diffusion Models for Speech Enhancement
Philippe Gonzalez
Zheng-Hua Tan
Jan Østergaard
Jesper Jensen
T. S. Alstrøm
Tobias May
DiffM
27
6
0
07 Dec 2023
DiffusionSat: A Generative Foundation Model for Satellite Imagery
Samar Khanna
Patrick Liu
Linqi Zhou
Chenlin Meng
Robin Rombach
Marshall Burke
David B. Lobell
Stefano Ermon
24
57
0
06 Dec 2023
Analyzing and Improving the Training Dynamics of Diffusion Models
Tero Karras
M. Aittala
J. Lehtinen
Janne Hellsten
Timo Aila
S. Laine
28
155
0
05 Dec 2023
DeepCache: Accelerating Diffusion Models for Free
Xinyin Ma
Gongfan Fang
Xinchao Wang
22
122
0
01 Dec 2023