ProDiff: Progressive Fast Diffusion Model For High-Quality Text-to-Speech

13 July 2022
Rongjie Huang
Zhou Zhao
Huadai Liu
Jinglin Liu
Chenye Cui
Yi Ren
    DiffM

Papers citing "ProDiff: Progressive Fast Diffusion Model For High-Quality Text-to-Speech"

50 / 126 papers shown
Breaking Free: How to Hack Safety Guardrails in Black-Box Diffusion Models!
Shashank Kotyan
Poyuan Mao
Pin-Yu Chen
Danilo Vasconcellos Vargas
AAML
DiffM
32
0
0
07 Feb 2024
Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis
Zhenhui Ye
Tianyun Zhong
Yi Ren
Jiaqi Yang
Weichuang Li
...
Jinglin Liu
Chen Zhang
Xiang Yin
Zejun Ma
Zhou Zhao
24
44
0
16 Jan 2024
ELLA-V: Stable Neural Codec Language Modeling with Alignment-guided Sequence Reordering
Ya-Zhen Song
Zhuo Chen
Xiaofei Wang
Ziyang Ma
Xie Chen
AuLLM
11
35
0
14 Jan 2024
TransFace: Unit-Based Audio-Visual Speech Synthesizer for Talking Head Translation
Xize Cheng
Rongjie Huang
Linjun Li
Tao Jin
Zehan Wang
Aoxiong Yin
Minglei Li
Xinyu Duan
Changpeng Yang
Zhou Zhao
28
2
0
23 Dec 2023
MM-TTS: Multi-modal Prompt based Style Transfer for Expressive Text-to-Speech Synthesis
Wenhao Guan
Yishuang Li
Tao Li
Hukai Huang
Feng Wang
Jiayan Lin
Lingyan Huang
Lin Li
Q. Hong
21
8
0
17 Dec 2023
StyleSinger: Style Transfer for Out-of-Domain Singing Voice Synthesis
Yu Zhang
Rongjie Huang
Ruiqi Li
Jinzheng He
Yan Xia
Feiyang Chen
Xinyu Duan
Baoxing Huai
Zhou Zhao
VLM
8
17
0
17 Dec 2023
Schrodinger Bridges Beat Diffusion Models on Text-to-Speech Synthesis
Zehua Chen
Guande He
Kaiwen Zheng
Xu Tan
Jun Zhu
DiffM
39
21
0
06 Dec 2023
Synthetic Shifts to Initial Seed Vector Exposes the Brittle Nature of Latent-Based Diffusion Models
Poyuan Mao
Shashank Kotyan
Tham Yik Foong
Danilo Vasconcellos Vargas
20
5
0
24 Nov 2023
TrainerAgent: Customizable and Efficient Model Training through LLM-Powered Multi-Agent System
Haoyuan Li
Hao Jiang
Tianke Zhang
Zhelun Yu
Aoxiong Yin
Hao Cheng
Siming Fu
Yuhao Zhang
Wanggui He
LLMAG
14
4
0
11 Nov 2023
Patch-based Selection and Refinement for Early Object Detection
Tianyi Zhang
Kishore Kasichainula
Yaoxin Zhuo
Baoxin Li
Jae-sun Seo
Yu Cao
20
5
0
03 Nov 2023
A Wireless AI-Generated Content (AIGC) Provisioning Framework Empowered by Semantic Communication
Runze Cheng
Yao Sun
Dusit Niyato
Lan Zhang
Lei Zhang
Muhammad Ali Imran
13
11
0
26 Oct 2023
BayesDiff: Estimating Pixel-wise Uncertainty in Diffusion via Bayesian Inference
Siqi Kou
Lei Gan
Dequan Wang
Chongxuan Li
Zhijie Deng
BDL
DiffM
10
7
0
17 Oct 2023
Realistic Speech-to-Face Generation with Speech-Conditioned Latent Diffusion Model with Face Prior
Jinting Wang
Li Liu
Jun Wang
Hei Victor Cheng
DiffM
13
2
0
05 Oct 2023
DiffAR: Denoising Diffusion Autoregressive Model for Raw Speech Waveform Generation
Roi Benita
Michael Elad
Joseph Keshet
DiffM
17
7
0
02 Oct 2023
ReFlow-TTS: A Rectified Flow Model for High-fidelity Text-to-Speech
Wenhao Guan
Qi Su
Haodong Zhou
Shiyu Miao
Xingjia Xie
Lin Li
Q. Hong
DiffM
8
13
0
29 Sep 2023
Diffusion Methods for Generating Transition Paths
Luke Triplett
Jianfeng Lu
17
5
0
19 Sep 2023
Contrastive Latent Space Reconstruction Learning for Audio-Text Retrieval
Kaiyi Luo
Xulong Zhang
Jianzong Wang
Huaxiong Li
Ning Cheng
Jing Xiao
61
2
0
16 Sep 2023
DCTTS: Discrete Diffusion Model with Contrastive Learning for Text-to-speech Generation
Zhichao Wu
Qiulin Li
Sixing Liu
Qun Yang
13
3
0
13 Sep 2023
VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching
Yiwei Guo
Chenpeng Du
Ziyang Ma
Xie Chen
K. Yu
DiffM
21
36
0
10 Sep 2023
Prefix-diffusion: A Lightweight Diffusion Model for Diverse Image Captioning
Guisheng Liu
Yi Li
Zhengcong Fei
Haiyan Fu
Xiangyang Luo
Yanqing Guo
VLM
DiffM
15
5
0
10 Sep 2023
Matcha-TTS: A fast TTS architecture with conditional flow matching
Shivam Mehta
Ruibo Tu
Jonas Beskow
Éva Székely
G. Henter
6
68
0
06 Sep 2023
An Efficient Temporary Deepfake Location Approach Based Embeddings for Partially Spoofed Audio Detection
Yuankun Xie
Haonan Cheng
Yutian Wang
Long Ye
14
6
0
06 Sep 2023
DiCLET-TTS: Diffusion Model based Cross-lingual Emotion Transfer for Text-to-Speech -- A Study between English and Mandarin
Tao Li
Chenxu Hu
Jian Cong
Xinfa Zhu
Jingbei Li
Qiao Tian
Yuping Wang
Linfu Xie
DiffM
17
8
0
02 Sep 2023
DiffusionVMR: Diffusion Model for Joint Video Moment Retrieval and Highlight Detection
Henghao Zhao
Kevin Qinghong Lin
Rui Yan
Zechao Li
VGen
DiffM
23
1
0
29 Aug 2023
Diffusion-based Image Translation with Label Guidance for Domain Adaptive Semantic Segmentation
Duo Peng
Ping Hu
Qiuhong Ke
J. Liu
DiffM
25
20
0
23 Aug 2023
A Survey on Deep Multi-modal Learning for Body Language Recognition and Generation
Li Liu
Lufei Gao
Wen-Ling Lei
Fengji Ma
Xiaotian Lin
Jin-Tao Wang
CVBM
19
5
0
17 Aug 2023
Degeneration-Tuning: Using Scrambled Grid shield Unwanted Concepts from Stable Diffusion
Zixuan Ni
Longhui Wei
Jiacheng Li
Siliang Tang
Yueting Zhuang
Qi Tian
DiffM
15
21
0
02 Aug 2023
DiffPose: SpatioTemporal Diffusion Model for Video-Based Human Pose Estimation
Runyang Feng
Yixing Gao
Tze Ho Elden Tse
Xu Ma
H. Chang
DiffM
29
23
0
31 Jul 2023
DisCover: Disentangled Music Representation Learning for Cover Song Identification
Jiahao Xun
Shengyu Zhang
Yanting Yang
Jieming Zhu
Liqun Deng
Zhou Zhao
Zhenhua Dong
Ruiqi Li
Lichao Zhang
Fei Wu
AAML
DRL
15
5
0
19 Jul 2023
Gloss Attention for Gloss-free Sign Language Translation
Aoxiong Yin
Tianyun Zhong
Lilian H. Y. Tang
Weike Jin
Tao Jin
Zhou Zhao
SLR
14
36
0
14 Jul 2023
The Ethical Implications of Generative Audio Models: A Systematic Literature Review
J. Barnett
12
23
0
07 Jul 2023
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
Yinghao Aaron Li
Cong Han
Vinay S. Raghavan
Gavin Mischler
N. Mesgarani
VLM
DiffM
21
107
0
13 Jun 2023
Interpretable Style Transfer for Text-to-Speech with ControlVAE and Diffusion Bridge
Wenhao Guan
Tao Li
Yishuang Li
Hukai Huang
Q. Hong
Lin Li
DiffM
16
6
0
07 Jun 2023
Mega-TTS: Zero-Shot Text-to-Speech at Scale with Intrinsic Inductive Bias
Ziyue Jiang
Yi Ren
Zhe Ye
Jinglin Liu
Chen Zhang
...
Rongjie Huang
Chunfeng Wang
Xiang Yin
Zejun Ma
Zhou Zhao
DiffM
21
72
0
06 Jun 2023
EmoMix: Emotion Mixing via Diffusion Models for Emotional Speech Synthesis
Haobin Tang
Xulong Zhang
Jianzong Wang
Ning Cheng
Jing Xiao
DiffM
14
22
0
01 Jun 2023
Conditional Score Guidance for Text-Driven Image-to-Image Translation
Hyunsoo Lee
Minsoo Kang
Bohyung Han
DiffM
11
14
0
29 May 2023
Diverse and Expressive Speech Prosody Prediction with Denoising Diffusion Probabilistic Model
Xiang Li
Songxiang Liu
Max W. Y. Lam
Zhiyong Wu
Chao Weng
H. Meng
DiffM
21
6
0
26 May 2023
Confronting Ambiguity in 6D Object Pose Estimation via Score-Based Diffusion on SE(3)
Tsu-Ching Hsiao
Haoming Chen
Hsuan-Kung Yang
Chun-Yi Lee
DiffM
18
6
0
25 May 2023
AV-TranSpeech: Audio-Visual Robust Speech-to-Speech Translation
Rongjie Huang
Huadai Liu
Xize Cheng
Yi Ren
Lin Li
...
Jinzheng He
Lichao Zhang
Jinglin Liu
Xiaoyue Yin
Zhou Zhao
44
8
0
24 May 2023
FluentSpeech: Stutter-Oriented Automatic Speech Editing with Context-Aware Diffusion Models
Ziyue Jiang
Qiang Yang
Jia-li Zuo
Zhe Ye
Rongjie Huang
Yixiang Ren
Zhou Zhao
DiffM
62
13
0
23 May 2023
ViT-TTS: Visual Text-to-Speech with Scalable Diffusion Transformer
Huadai Liu
Rongjie Huang
Xuan Lin
Wenqiang Xu
Maozong Zheng
Hong Chen
Jinzheng He
Zhou Zhao
DiffM
21
20
0
22 May 2023
Wav2SQL: Direct Generalizable Speech-To-SQL Parsing
Huadai Liu
Rongjie Huang
Jinzheng He
Gang Sun
Ran Shen
Xize Cheng
Zhou Zhao
17
3
0
21 May 2023
CLAPSpeech: Learning Prosody from Text Context with Contrastive Language-Audio Pre-training
Zhe Ye
Rongjie Huang
Yi Ren
Ziyue Jiang
Jinglin Liu
Jinzheng He
Xiang Yin
Zhou Zhao
CLIP
18
18
0
18 May 2023
CoMoSpeech: One-Step Speech and Singing Voice Synthesis via Consistency Model
Zhe Ye
Wei Xue
Xuejiao Tan
Jie Chen
Qi-fei Liu
Yi-Ting Guo
DiffM
17
40
0
11 May 2023
AlignSTS: Speech-to-Singing Conversion via Cross-Modal Alignment
Ruiqi Li
Rongjie Huang
Lichao Zhang
Jinglin Liu
Zhou Zhao
23
4
0
08 May 2023
HiFi-Codec: Group-residual Vector quantization for High Fidelity Audio Codec
Dongchao Yang
Songxiang Liu
Rongjie Huang
Jinchuan Tian
Chao Weng
Yuexian Zou
138
118
0
04 May 2023
A Comprehensive Survey on Knowledge Distillation of Diffusion Models
Weijian Luo
DiffM
MedIm
20
33
0
09 Apr 2023
DATE: Domain Adaptive Product Seeker for E-commerce
Haoyuan Li
Haojie Jiang
Tao Jin
Meng-Juan Li
Yan Chen
Zhijie Lin
Yang Zhao
Zhou Zhao
18
6
0
07 Apr 2023
Physics-Driven Diffusion Models for Impact Sound Synthesis from Videos
Kun Su
Kaizhi Qian
Eli Shlizerman
Antonio Torralba
Chuang Gan
VGen
AI4CE
27
19
0
29 Mar 2023
DiffTAD: Temporal Action Detection with Proposal Denoising Diffusion
Sauradip Nag
Xiatian Zhu
Jiankang Deng
Yi-Zhe Song
Tao Xiang
DiffM
VGen
25
21
0
27 Mar 2023