Diff-TTS: A Denoising Diffusion Model for Text-to-Speech

Interspeech (Interspeech), 2021
3 April 2021
Myeonghun Jeong
Hyeongju Kim
Sung Jun Cheon
Byoung Jin Choi
N. Kim
    DiffM

Papers citing "Diff-TTS: A Denoising Diffusion Model for Text-to-Speech"

50 / 150 papers shown
Noise Aggregation Analysis Driven by Small-Noise Injection: Efficient Membership Inference for Diffusion Models
Guo Li
Yuyang Yu
Xuemiao Xu
DiffM
178
0
0
18 Oct 2025
An Octave-based Multi-Resolution CQT Architecture for Diffusion-based Audio Generation
Maurício do V. M. da Costa
Eloi Moliner
DiffM
229
1
0
20 Sep 2025
Length-Aware Rotary Position Embedding for Text-Speech Alignment
Hyeongju Kim
Juheon Lee
Jinhyeok Yang
Jacob Morton
AuLLM
126
1
0
14 Sep 2025
DiTReducio: A Training-Free Acceleration for DiT-Based TTS via Progressive Calibration
Yanru Huo
Ziyue Jiang
Zuoli Tang
Q. Hong
Zhou Zhao
179
1
0
11 Sep 2025
Navigating the Exploration-Exploitation Tradeoff in Inference-Time Scaling of Diffusion Models
Xun Su
Jianming Huang
Yang Yusen
Zhongxi Fang
Hiroyuki Kasai
DiffM
239
2
0
17 Aug 2025
RapFlow-TTS: Rapid and High-Fidelity Text-to-Speech with Improved Consistency Flow Matching
Hyun Joon Park
Jeongmin Liu
Jin Sob Kim
Jeong Yeol Yang
Sung Won Han
Eunwoo Song
244
1
0
20 Jun 2025
Audio Generation Through Score-Based Generative Modeling: Design Principles and Implementation
Ge Zhu
Yutong Wen
Zhiyao Duan
DiffM, MedIm
326
3
0
10 Jun 2025
ZeroSep: Separate Anything in Audio with Zero Training
Chao Huang
Yuesheng Ma
J. Huang
Susan Liang
Yunlong Tang
Jing Bi
Wenqiang Liu
Nima Mesgarani
Chenliang Xu
DiffM, VLM
333
5
0
29 May 2025
CloneShield: A Framework for Universal Perturbation Against Zero-Shot Voice Cloning
Renyuan Li
Zhibo Liang
Haichuan Zhang
Tianyu Shi
Zhiyuan Cheng
Jia Shi
Carl Yang
Mingjie Tang
AAML
423
2
0
25 May 2025
Constraint-Aware Diffusion Guidance for Robotics: Real-Time Obstacle Avoidance for Autonomous Racing
Hao Ma
Sabrina Bodmer
Andrea Carron
Melanie Zeilinger
Michael Muehlebach
247
3
0
19 May 2025
VoiceCloak: A Multi-Dimensional Defense Framework against Unauthorized Diffusion-based Voice Cloning
Qianyue Hu
Junyan Wu
Wei Lu
Xiangyang Luo
DiffM, AAML
378
0
0
18 May 2025
Language translation, and change of accent for speech-to-speech task using diffusion model
Abhishek Mishra
Ritesh Sur Chowdhury
Vartul Bahuguna
Isha Pandey
Ganesh Ramakrishnan
DiffM
247
0
0
04 May 2025
Pseudo-Autoregressive Neural Codec Language Models for Efficient Zero-Shot Text-to-Speech Synthesis
Yifan Yang
Shixuan Liu
Jiajian Li
Yuxuan Hu
Haibin Wu
...
Haiyang Sun
Yanqing Liu
Yan Lu
Kai Yu
Xie Chen
423
9
0
14 Apr 2025
SlimSpeech: Lightweight and Efficient Text-to-Speech with Slim Rectified Flow
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
Kaidi Wang
Wenhao Guan
Shenghui Lu
Jianglong Yao
Lin Li
Q. Hong
497
4
0
10 Apr 2025
SupertonicTTS: Towards Highly Efficient and Streamlined Text-to-Speech System
Hyeongju Kim
Jinhyeok Yang
Yechan Yu
Seunghun Ji
Jacob Morton
Frederik Bous
Joon Byun
Juheon Lee
531
1
0
29 Mar 2025
Dual Audio-Centric Modality Coupling for Talking Head Generation
Ao Fu
Ziqi Ni
Yi Zhou
365
2
0
26 Mar 2025
MAVFlow: Preserving Paralinguistic Elements with Conditional Flow Matching for Zero-Shot AV2AV Multilingual Translation
Sungwoo Cho
J. Choi
Sungnyun Kim
Se-Young Yun
406
0
0
14 Mar 2025
AudioX: A Unified Framework for Anything-to-Audio Generation
Zeyue Tian
Yizhu Jin
Zhaoyang Liu
Ruibin Yuan
Xu Tan
Qifeng Chen
Wei Xue
Xu Tan
Yike Guo
VGen
575
33
0
13 Mar 2025
Survey on AI-Generated Media Detection: From Non-MLLM to MLLM
Yueying Zou
Peipei Li
Zekun Li
Huaibo Huang
Xing Cui
Xuannan Liu
Chenghanyu Zhang
Ran He
DeLMO
892
16
0
07 Feb 2025
UIBDiffusion: Universal Imperceptible Backdoor Attack for Diffusion Models
Computer Vision and Pattern Recognition (CVPR), 2024
Yuning Han
Bingyin Zhao
Rui Chu
Feng Luo
Biplab Sikdar
Yingjie Lao
DiffM, AAML
680
6
0
16 Dec 2024
DiffStyleTTS: Diffusion-based Hierarchical Prosody Modeling for Text-to-Speech with Diverse and Controllable Styles
International Conference on Computational Linguistics (COLING), 2024
Jiaxuan Liu
Zhaoci Liu
Yihan Hu
Yingying Gao
Shilei Zhang
Zhenhua Ling
DiffM
295
8
0
04 Dec 2024
A roadmap for generative mapping: unlocking the power of generative AI for map-making
Sidi Wu
Katharina Henggeler
Yizi Chen
L. Hurni
122
3
0
21 Oct 2024
Generative Co-Learners: Enhancing Cognitive and Social Presence of Students in Asynchronous Learning with Generative AI
Tianjia Wang
Tong Wu
Huayi Liu
Chris Brown
Yan Chen
213
2
0
06 Oct 2024
DPI-TTS: Directional Patch Interaction for Fast-Converging and Style Temporal Modeling in Text-to-Speech
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Xin Qi
Ruibo Fu
Zhengqi Wen
Tao Wang
Chunyu Qiang
...
Xiaopeng Wang
Yuankun Xie
Yukun Liu
Zhengqi Wen
Guanjun Li
DiffM
350
1
0
18 Sep 2024
Improving Robustness of Diffusion-Based Zero-Shot Speech Synthesis via Stable Formant Generation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
C. Han
Seokgi Lee
Gyuhyeon Nam
Gyeongsu Chae
DiffM
1.1K
1
0
14 Sep 2024
DFADD: The Diffusion and Flow-Matching Based Audio Deepfake Dataset
Spoken Language Technology Workshop (SLT), 2024
Jiawei Du
I-Ming Lin
I-Hsiang Chiu
Xuanjun Chen
Haibin Wu
Wenze Ren
Yu Tsao
Hung-yi Lee
Jyh-Shing Roger Jang
DiffM
305
23
0
13 Sep 2024
A Simple Early Exiting Framework for Accelerated Sampling in Diffusion Models
International Conference on Machine Learning (ICML), 2024
Taehong Moon
Moonseok Choi
Eunggu Yun
Jongmin Yoon
Gayoung Lee
Jaewoong Cho
Juho Lee
285
9
0
12 Aug 2024
Central Kurdish Text-to-Speech Synthesis with Novel End-to-End Transformer Training
Hawraz A. Ahmad
Tarik A. Rashid
293
1
0
06 Aug 2024
Piecewise deterministic generative models
Neural Information Processing Systems (NeurIPS), 2024
Andrea Bertazzi
Alain Durmus
Dario Shariatian
Umut Simsekli
Éric Moulines
DiffM
247
4
0
28 Jul 2024
SlimFlow: Training Smaller One-Step Diffusion Models with Rectified Flow
Yuanzhi Zhu
Xingchao Liu
Qiang Liu
349
29
0
17 Jul 2024
LiteFocus: Accelerated Diffusion Inference for Long Audio Synthesis
Zhenxiong Tan
Xinyin Ma
Gongfan Fang
Xinchao Wang
312
4
0
15 Jul 2024
Adaptive Compressed Sensing with Diffusion-Based Posterior Sampling
Noam Elata
T. Michaeli
Michael Elad
DiffM, MedIm
333
15
0
11 Jul 2024
DEX-TTS: Diffusion-based EXpressive Text-to-Speech with Style Modeling on Time Variability
Hyun Joon Park
Jin Sob Kim
Wooseok Shin
Sung Won Han
DiffM
217
7
0
27 Jun 2024
Flow map matching with stochastic interpolants: A mathematical framework for consistency models
Nicholas M. Boffi
M. S. Albergo
Eric Vanden-Eijnden
277
6
0
11 Jun 2024
MakeSinger: A Semi-Supervised Training Method for Data-Efficient Singing Voice Synthesis via Classifier-free Diffusion Guidance
Interspeech (Interspeech), 2024
Semin Kim
Myeonghun Jeong
Hyeonseung Lee
Minchan Kim
Byoung Jin Choi
Nam Soo Kim
VLM, DiffM
345
4
0
10 Jun 2024
Convergence of the denoising diffusion probabilistic models for general noise schedules
Yumiharu Nakano
DiffM
699
2
0
03 Jun 2024
A Survey of Deep Learning Audio Generation Methods
Matej Bozic
Marko Horvat
VLM, MedIm
350
9
0
31 May 2024
Diff-ETS: Learning a Diffusion Probabilistic Model for Electromyography-to-Speech Conversion
Zhao Ren
Kevin Scheck
Qinhan Hou
Stefano van Gogh
Michael Wand
Tanja Schultz
DiffM
318
7
0
11 May 2024
FlashSpeech: Efficient Zero-Shot Speech Synthesis
Zhen Ye
Zeqian Ju
Haohe Liu
Xu Tan
Jianyi Chen
...
Weizhen Bian
Shulin He
Qi-fei Liu
Yi-Ting Guo
Wei Xue
366
32
0
23 Apr 2024
An Overview of Diffusion Models: Applications, Guided Generation, Statistical Rates and Optimization
Minshuo Chen
Song Mei
Jianqing Fan
Mengdi Wang
VLM, MedIm, DiffM
415
95
0
11 Apr 2024
CoVoMix: Advancing Zero-Shot Speech Generation for Human-like Multi-talker Conversations
Neural Information Processing Systems (NeurIPS), 2024
Leying Zhang
Yao Qian
Long Zhou
Shujie Liu
Dongmei Wang
...
Yanmin Qian
Jinyu Li
Lei He
Sheng Zhao
Michael Zeng
305
21
0
10 Apr 2024
CM-TTS: Enhancing Real Time Text-to-Speech Synthesis Efficiency through Weighted Samplers and Consistency Models
Xiang Li
Fan Bu
Ambuj Mehrish
Yingting Li
Jiale Han
Bo Cheng
Soujanya Poria
DiffM
181
12
0
31 Mar 2024
GetMesh: A Controllable Model for High-quality Mesh Generation and Manipulation
Zhaoyang Lyu
Ben Fei
Jinyi Wang
Xudong Xu
Ya Zhang
Weidong Yang
Bo Dai
219
8
0
18 Mar 2024
Towards Faster Training of Diffusion Models: An Inspiration of A Consistency Phenomenon
Tianshuo Xu
Peng Mi
Ruilin Wang
Yingcong Chen
DiffM
383
16
0
14 Mar 2024
EM-TTS: Efficiently Trained Low-Resource Mongolian Lightweight Text-to-Speech
International Conference on Computer Supported Cooperative Work in Design (CSCWD), 2024
Ziqi Liang
Haoxiang Shi
Jiawei Wang
Keda Lu
313
0
0
13 Mar 2024
An Automated End-to-End Open-Source Software for High-Quality Text-to-Speech Dataset Generation
Ahmet Gunduz
K. Yuksel
Kareem Darwish
Golara Javadi
Fabio Minazzi
Nicola Sobieski
Sebastien Bratieres
191
1
0
26 Feb 2024
On the Semantic Latent Space of Diffusion-Based Text-to-Speech Models
Miri Varshavsky-Hassid
Roy Hirsch
Regev Cohen
Tomer Golany
Daniel Freedman
Ehud Rivlin
275
4
0
19 Feb 2024
Speaking in Wavelet Domain: A Simple and Efficient Approach to Speed up Speech Diffusion Model
Xiangyu Zhang
Daijiao Liu
Hexin Liu
Qiquan Zhang
Hanyu Meng
Leibny Paola García
Chng Eng Siong
Lina Yao
DiffM
330
13
0
16 Feb 2024
Classification Diffusion Models: Revitalizing Density Ratio Estimation
Shahar Yadin
Noam Elata
T. Michaeli
DiffM
314
2
0
15 Feb 2024
Diff-RNTraj: A Structure-aware Diffusion Model for Road Network-constrained Trajectory Generation
IEEE Transactions on Knowledge and Data Engineering (TKDE), 2024
Tonglong Wei
Youfang Lin
Shengnan Guo
Yan Lin
YiHeng Huang
Chenyang Xiang
Yuqing Bai
Menglu Ya
Huaiyu Wan
225
36
0
12 Feb 2024