arXiv: 2204.09934
FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis

21 April 2022
Rongjie Huang
Max W. Y. Lam
J. Wang
Dan Su
Dong Yu
Yi Ren
Zhou Zhao
    DiffM

Papers citing "FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis"

50 / 112 papers shown
TriniMark: A Robust Generative Speech Watermarking Method for Trinity-Level Attribution
Yue Li
W. Liu
Dongdong Lin
42
0
0
29 Apr 2025
Dysarthria Normalization via Local Lie Group Transformations for Robust ASR
Mikhail Osipov
41
0
0
16 Apr 2025
DiffuMural: Restoring Dunhuang Murals with Multi-scale Diffusion
Puyu Han
Jiaju Kang
Yuhang Pan
Erting Pan
Zeyu Zhang
Qunchao Jin
Juntao Jiang
Zhichen Liu
Luqi Gong
29
0
0
13 Apr 2025
On the Design of Diffusion-based Neural Speech Codecs
Pietro Foti
Andreas Brendel
DiffM
34
0
0
11 Apr 2025
WaveFM: A High-Fidelity and Efficient Vocoder Based on Flow Matching
Tianze Luo
Xingchen Miao
Wenbo Duan
DiffM
37
0
0
20 Mar 2025
Shushing! Let's Imagine an Authentic Speech from the Silent Video
Jiaxin Ye
Hongming Shan
DiffM
VGen
61
1
0
19 Mar 2025
Bayesian Computation in Deep Learning
Wenlong Chen
Bolian Li
Ruqi Zhang
Yingzhen Li
BDL
70
0
0
25 Feb 2025
UIBDiffusion: Universal Imperceptible Backdoor Attack for Diffusion Models
Yuning Han
Bingyin Zhao
Rui Chu
Feng Luo
Biplab Sikdar
Yingjie Lao
DiffM
AAML
67
1
0
16 Dec 2024
Deepfake Media Generation and Detection in the Generative AI Era: A Survey and Outlook
Florinel-Alin Croitoru
Andrei Iulian Hiji
Vlad Hondru
Nicolae-Cătălin Ristea
Paul Irofti
Marius Popescu
Cristian Rusu
Radu Tudor Ionescu
F. Khan
Mubarak Shah
82
2
0
29 Nov 2024
Accelerating Codec-based Speech Synthesis with Multi-Token Prediction and Speculative Decoding
Tan Dat Nguyen
Ji-Hoon Kim
Jeongsoo Choi
Shukjae Choi
Jinseok Park
Younglo Lee
Joon Son Chung
26
0
0
17 Oct 2024
DPI-TTS: Directional Patch Interaction for Fast-Converging and Style Temporal Modeling in Text-to-Speech
Xin Qi
Ruibo Fu
Zhengqi Wen
Tao Wang
Chunyu Qiang
...
Xiaopeng Wang
Yuankun Xie
Yukun Liu
Xuefei Liu
Guanjun Li
DiffM
23
0
0
18 Sep 2024
USTC-KXDIGIT System Description for ASVspoof5 Challenge
Y. Chen
Haochen Wu
Nan Jiang
Xiang Xia
Qing Gu
...
Sian Fang
Yan Song
Wu Guo
Lin Liu
Minqiang Xu
34
1
0
03 Sep 2024
Accelerating High-Fidelity Waveform Generation via Adversarial Flow Matching Optimization
Sang-Hoon Lee
Ha-Yeong Choi
Seong-Whan Lee
AI4TS
27
1
0
15 Aug 2024
PeriodWave: Multi-Period Flow Matching for High-Fidelity Waveform Generation
Sang-Hoon Lee
Ha-Yeong Choi
Seong-Whan Lee
OOD
DiffM
AI4TS
43
5
0
14 Aug 2024
VNet: A GAN-based Multi-Tier Discriminator Network for Speech Synthesis Vocoders
Yubing Cao
Yongming Li
Liejun Wang
Yinfeng Yu
23
0
0
13 Aug 2024
Semantic Codebook Learning for Dynamic Recommendation Models
Zheqi Lv
Shaoxuan He
Ahmed Salem
Minxing Zhang
Wenqiao Zhang
Jingyuan Chen
Yang Zhang
Fei Wu
21
5
0
31 Jul 2024
ASRRL-TTS: Agile Speaker Representation Reinforcement Learning for Text-to-Speech Speaker Adaptation
Ruibo Fu
Xin Qi
Zhengqi Wen
Jianhua Tao
Tao Wang
...
Xiaopeng Wang
Shuchen Shi
Yukun Liu
Xuefei Liu
Shuai Zhang
47
0
0
07 Jul 2024
TacoLM: GaTed Attention Equipped Codec Language Model are Efficient Zero-Shot Text to Speech Synthesizers
Yakun Song
Zhuo Chen
Xiaofei Wang
Ziyang Ma
Guanrou Yang
Xie Chen
AuLLM
30
3
0
22 Jun 2024
Diffusion Gaussian Mixture Audio Denoise
Pu Wang
Junhui Li
Jialu Li
Liangdong Guo
Youshan Zhang
DiffM
29
0
0
13 Jun 2024
MeLFusion: Synthesizing Music from Image and Language Cues using Diffusion Models
Sanjoy Chowdhury
Sayan Nag
K. J. Joseph
Balaji Vasan Srinivasan
Dinesh Manocha
DiffM
41
7
0
07 Jun 2024
ControlSpeech: Towards Simultaneous Zero-shot Speaker Cloning and Zero-shot Language Style Control With Decoupled Codec
Shengpeng Ji
Jia-li Zuo
Minghui Fang
Siqi Zheng
Qian Chen
...
Ziyue Jiang
Hai Huang
Xize Cheng
Rongjie Huang
Zhou Zhao
45
8
0
03 Jun 2024
Towards Black-Box Membership Inference Attack for Diffusion Models
Jingwei Li
Jingyi Dong
Tianxing He
Jingzhao Zhang
25
3
0
25 May 2024
AIGB: Generative Auto-bidding via Diffusion Modeling
Jiayan Guo
Yusen Huo
Zhilin Zhang
Tianyu Wang
Chuan Yu
Jian Xu
Yan Zhang
Bo Zheng
DiffM
27
1
0
25 May 2024
Converting Anyone's Voice: End-to-End Expressive Voice Conversion with a Conditional Diffusion Model
Zongyang Du
Junchen Lu
Kun Zhou
Lakshmish Kaushik
Berrak Sisman
33
1
0
02 May 2024
A Survey on Diffusion Models for Time Series and Spatio-Temporal Data
Yiyuan Yang
Ming Jin
Haomin Wen
Chaoli Zhang
Yuxuan Liang
...
Bin Yang
Zenglin Xu
Jiang Bian
Shirui Pan
Qingsong Wen
DiffM
AI4TS
SyDa
29
36
0
29 Apr 2024
CoVoMix: Advancing Zero-Shot Speech Generation for Human-like Multi-talker Conversations
Leying Zhang
Yao Qian
Long Zhou
Shujie Liu
Dongmei Wang
...
Yanmin Qian
Jinyu Li
Lei He
Sheng Zhao
Michael Zeng
26
1
0
10 Apr 2024
RFWave: Multi-band Rectified Flow for Audio Waveform Reconstruction
Peng Liu
Dongyang Dai
Zhiyong Wu
21
2
0
08 Mar 2024
HIR-Diff: Unsupervised Hyperspectral Image Restoration Via Improved Diffusion Models
Li Pang
Xiangyu Rui
Long Cui
Hongzhong Wang
Deyu Meng
Xiangyong Cao
DiffM
37
16
0
24 Feb 2024
On the Semantic Latent Space of Diffusion-Based Text-to-Speech Models
Miri Varshavsky-Hassid
Roy Hirsch
Regev Cohen
Tomer Golany
Daniel Freedman
Ehud Rivlin
26
3
0
19 Feb 2024
Speaking in Wavelet Domain: A Simple and Efficient Approach to Speed up Speech Diffusion Model
Xiangyu Zhang
Daijiao Liu
Hexin Liu
Qiquan Zhang
Hanyu Meng
Leibny Paola García
Chng Eng Siong
Lina Yao
DiffM
13
2
0
16 Feb 2024
MobileSpeech: A Fast and High-Fidelity Framework for Mobile Zero-Shot Text-to-Speech
Shengpeng Ji
Ziyue Jiang
Hanting Wang
Jia-li Zuo
Zhou Zhao
32
9
0
14 Feb 2024
GLA-Grad: A Griffin-Lim Extended Waveform Generation Diffusion Model
Haocheng Liu
Teysir Baoueb
Mathieu Fontaine
Jonathan Le Roux
Gaël Richard
29
4
0
09 Feb 2024
Breaking Free: How to Hack Safety Guardrails in Black-Box Diffusion Models!
Shashank Kotyan
Poyuan Mao
Pin-Yu Chen
Danilo Vasconcellos Vargas
AAML
DiffM
35
0
0
07 Feb 2024
FreGrad: Lightweight and Fast Frequency-aware Diffusion Vocoder
Tan Dat Nguyen
Ji-Hoon Kim
Youngjoon Jang
Jaehun Kim
Joon Son Chung
DiffM
24
5
0
18 Jan 2024
A Good Score Does not Lead to A Good Generative Model
Sixu Li
Shi Chen
Qin Li
DiffM
64
15
0
10 Jan 2024
FurniScene: A Large-scale 3D Room Dataset with Intricate Furnishing Scenes
Genghao Zhang
Yuxi Wang
Chuanchen Luo
Shibiao Xu
Zhaoxiang Zhang
Man Zhang
Junran Peng
VGen
3DV
25
4
0
07 Jan 2024
DiffAugment: Diffusion based Long-Tailed Visual Relationship Recognition
Parul Gupta
Tuan Nguyen
Abhinav Dhall
Munawar Hayat
Trung Le
Thanh-Toan Do
25
0
0
01 Jan 2024
TransFace: Unit-Based Audio-Visual Speech Synthesizer for Talking Head Translation
Xize Cheng
Rongjie Huang
Linjun Li
Tao Jin
Zehan Wang
Aoxiong Yin
Minglei Li
Xinyu Duan
Changpeng Yang
Zhou Zhao
28
2
0
23 Dec 2023
FaceTalk: Audio-Driven Motion Diffusion for Neural Parametric Head Models
Shivangi Aneja
Justus Thies
Angela Dai
Matthias Nießner
DiffM
VGen
24
29
0
13 Dec 2023
Efficient Representation of the Activation Space in Deep Neural Networks
Tanya Akumu
C. Cintas
G. Tadesse
Adebayo Oshingbesan
Skyler Speakman
E. McFowland
AAML
10
0
0
13 Dec 2023
One-Step Diffusion Distillation via Deep Equilibrium Models
Zhengyang Geng
Ashwini Pokle
Trevor Killeen
26
28
0
12 Dec 2023
Synthetic Shifts to Initial Seed Vector Exposes the Brittle Nature of Latent-Based Diffusion Models
Poyuan Mao
Shashank Kotyan
Tham Yik Foong
Danilo Vasconcellos Vargas
22
5
0
24 Nov 2023
TransFusion -- A Transparency-Based Diffusion Model for Anomaly Detection
Matic Fucka
Vitjan Zavrtanik
D. Skočaj
MedIm
DiffM
19
9
0
16 Nov 2023
Diff-HierVC: Diffusion-based Hierarchical Voice Conversion with Robust Pitch Generation and Masked Prior for Zero-shot Speaker Adaptation
Haram Choi
Sang-Hoon Lee
Seong-Whan Lee
DiffM
18
24
0
08 Nov 2023
Entity Embeddings : Perspectives Towards an Omni-Modality Era for Large Language Models
Eren Unlu
Unver Ciftci
28
0
0
27 Oct 2023
Style Description based Text-to-Speech with Conditional Prosodic Layer Normalization based Diffusion GAN
Neeraj Kumar
Ankur Narang
Brejesh Lall
DiffM
16
0
0
27 Oct 2023
DASpeech: Directed Acyclic Transformer for Fast and High-quality Speech-to-Speech Translation
Qingkai Fang
Yan Zhou
Yangzhou Feng
32
6
0
11 Oct 2023
Generative Spoken Language Model based on continuous word-sized audio tokens
Robin Algayres
Yossi Adi
Tu Nguyen
Jade Copet
Gabriel Synnaeve
Benoît Sagot
Emmanuel Dupoux
AuLLM
38
12
0
08 Oct 2023
DiffAR: Denoising Diffusion Autoregressive Model for Raw Speech Waveform Generation
Roi Benita
Michael Elad
Joseph Keshet
DiffM
25
7
0
02 Oct 2023
ConsistencyTTA: Accelerating Diffusion-Based Text-to-Audio Generation with Consistency Distillation
Yatong Bai
Trung D. Q. Dang
Dung N. Tran
K. Koishida
Somayeh Sojoudi
DiffM
44
22
0
19 Sep 2023