DiffWave: A Versatile Diffusion Model for Audio Synthesis

International Conference on Learning Representations (ICLR), 2020
21 September 2020
Zhifeng Kong
Ming-Yu Liu
Jiaji Huang
Kexin Zhao
Bryan Catanzaro
DiffM, BDL

Papers citing "DiffWave: A Versatile Diffusion Model for Audio Synthesis"

50 / 1,133 papers shown
MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation
Computer Vision and Pattern Recognition (CVPR), 2022
Ludan Ruan
Yi Ma
Huan Yang
Huiguo He
Bei Liu
Jianlong Fu
Nicholas Jing Yuan
Qin Jin
B. Guo
DiffM, VGen
378
242
0
19 Dec 2022
Latent Diffusion for Language Generation
Neural Information Processing Systems (NeurIPS), 2022
Justin Lovelace
Varsha Kishore
Chao-gang Wan
Eliot Shekhtman
Kilian Q. Weinberger
DiffM
231
108
0
19 Dec 2022
Empowering Diffusion Models on the Embedding Space for Text Generation
North American Chapter of the Association for Computational Linguistics (NAACL), 2022
Zhujin Gao
Junliang Guo
Xuejiao Tan
Yongxin Zhu
Fang Zhang
Jiang Bian
Linli Xu
DiffM
211
33
0
19 Dec 2022
Uncovering the Disentanglement Capability in Text-to-Image Diffusion Models
Computer Vision and Pattern Recognition (CVPR), 2022
Qiucheng Wu
Yujian Liu
Handong Zhao
Ajinkya Kale
T. Bui
Tong Yu
Zhe Lin
Yang Zhang
Shiyu Chang
DiffM, CoGe
239
122
0
16 Dec 2022
How to Backdoor Diffusion Models?
Computer Vision and Pattern Recognition (CVPR), 2022
Sheng-Yen Chou
Pin-Yu Chen
Tsung-Yi Ho
DiffM, SILM
405
114
0
11 Dec 2022
Framewise WaveGAN: High Speed Adversarial Vocoder in Time Domain with Very Low Computational Complexity
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Ahmed Mustafa
J. Valin
Jan Büthe
Paris Smaragdis
Mike Goodwin
123
6
0
08 Dec 2022
MoFusion: A Framework for Denoising-Diffusion-based Motion Synthesis
Computer Vision and Pattern Recognition (CVPR), 2022
Rishabh Dabral
Muhammad Hamza Mughal
Vladislav Golyanik
Christian Theobalt
DiffM, VGen
269
228
0
08 Dec 2022
Talking Head Generation with Probabilistic Audio-to-Visual Diffusion Priors
IEEE International Conference on Computer Vision (ICCV), 2022
Zhentao Yu
Zixin Yin
Deyu Zhou
Duomin Wang
Finn Wong
Baoyuan Wang
DiffM
187
54
0
07 Dec 2022
Diffusion-SDF: Text-to-Shape via Voxelized Diffusion
Computer Vision and Pattern Recognition (CVPR), 2022
Muheng Li
Yueqi Duan
Jie Zhou
Jiwen Lu
DiffM
267
149
0
06 Dec 2022
Denoising diffusion probabilistic models for probabilistic energy forecasting
Esteban Hernandez Capel
Jonathan Dumas
DiffM
272
22
0
06 Dec 2022
DiffusionInst: Diffusion Model for Instance Segmentation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Zhangxuan Gu
Haoxing Chen
Zhuoer Xu
Jun Lan
Changhua Meng
Weiqiang Wang
DiffM
247
108
0
06 Dec 2022
PhysDiff: Physics-Guided Human Motion Diffusion Model
IEEE International Conference on Computer Vision (ICCV), 2022
Ye Yuan
Jiaming Song
Umar Iqbal
Arash Vahdat
Jan Kautz
VGen, DiffM
567
351
0
05 Dec 2022
Diffusion Generative Models in Infinite Dimensions
International Conference on Artificial Intelligence and Statistics (AISTATS), 2022
Gavin Kerrigan
Justin Ley
Padhraic Smyth
DiffM
336
43
0
01 Dec 2022
3D-LDM: Neural Implicit 3D Shape Generation with Latent Diffusion Models
Gimin Nam
Mariem Khlifi
Andrew Rodriguez
Alberto Tono
Linqi Zhou
Paul Guerrero
DiffM
245
78
0
01 Dec 2022
Denoising Diffusion for Sampling SAT Solutions
Kārlis Freivalds
Sergejs Kozlovics
117
3
0
30 Nov 2022
DiffusionBERT: Improving Generative Masked Language Models with Diffusion Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Zhengfu He
Tianxiang Sun
Kuan-Chieh Wang
Xuanjing Huang
Xipeng Qiu
DiffM, VLM
203
201
0
28 Nov 2022
Fast Sampling of Diffusion Models via Operator Learning
International Conference on Machine Learning (ICML), 2022
Hongkai Zheng
Weili Nie
Arash Vahdat
Kamyar Azizzadenesheli
Anima Anandkumar
DiffM
369
180
0
24 Nov 2022
Peekaboo: Text to Image Diffusion Models are Zero-Shot Segmentors
R. Burgert
Kanchana Ranasinghe
Xiang Li
Michael S. Ryoo
DiffM, VLM
258
41
0
23 Nov 2022
Diffusion Denoising Process for Perceptron Bias in Out-of-distribution Detection
Luping Liu
Yi Ren
Xize Cheng
Rongjie Huang
Chongxuan Li
Zhou Zhao
149
7
0
21 Nov 2022
Robust Vocal Quality Feature Embeddings for Dysphonic Voice Detection
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Jianwei Zhang
J. Liss
Suren Jayasuriya
Visar Berisha
173
8
0
17 Nov 2022
InstructPix2Pix: Learning to Follow Image Editing Instructions
Computer Vision and Pattern Recognition (CVPR), 2022
Tim Brooks
Aleksander Holynski
Alexei A. Efros
DiffM
544
2,451
0
17 Nov 2022
Listen, Denoise, Action! Audio-Driven Motion Synthesis with Diffusion Models
ACM Transactions on Graphics (TOG), 2022
Simon Alexanderson
Rajmund Nagy
Jonas Beskow
G. Henter
DiffM, VGen
253
224
0
17 Nov 2022
EmoDiff: Intensity Controllable Emotional Text-to-Speech with Soft-Label Guidance
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Yiwei Guo
Chenpeng Du
Xie Chen
K. Yu
DiffM
215
56
0
17 Nov 2022
Challenges in creative generative models for music: a divergence maximization perspective
Axel Chemla-Romeu-Santos
P. Esling
231
4
0
16 Nov 2022
Versatile Diffusion: Text, Images and Variations All in One Diffusion Model
IEEE International Conference on Computer Vision (ICCV), 2022
Xingqian Xu
Zinan Lin
Eric Zhang
Kai Wang
Humphrey Shi
DiffM
483
239
0
15 Nov 2022
Diffusion Models for Medical Image Analysis: A Comprehensive Survey
Amirhossein Kazerouni
Ehsan Khodapanah Aghdam
Moein Heidari
Reza Azad
Mohsen Fayyaz
Ilker Hacihaliloglu
Dorit Merhof
DiffM, MedIm
438
550
0
14 Nov 2022
Arbitrary Style Guidance for Enhanced Diffusion-Based Text-to-Image Generation
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022
Zhihong Pan
Xiaoxia Zhou
Hao Tian
DiffM
149
14
0
14 Nov 2022
DriftRec: Adapting diffusion models to blind JPEG restoration
IEEE Transactions on Image Processing (IEEE TIP), 2022
Simon Welker
H. Chapman
Timo Gerkmann
DiffM
187
25
0
12 Nov 2022
Few-shot Image Generation with Diffusion Models
Jin Zhu
Huimin Ma
Jiansheng Chen
Jian Yuan
DiffM
284
28
0
07 Nov 2022
Modeling Temporal Data as Continuous Functions with Stochastic Process Diffusion
International Conference on Machine Learning (ICML), 2022
Marin Bilos
Kashif Rasul
Anderson Schneider
Yuriy Nevmyvaka
Stephan Günnemann
DiffM
269
48
0
04 Nov 2022
Cold Diffusion for Speech Enhancement
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Hao Yen
François Germain
Gordon Wichern
Jonathan Le Roux
DiffM
297
54
0
04 Nov 2022
NoreSpeech: Knowledge Distillation based Conditional Diffusion Model for Noise-robust Expressive TTS
Interspeech (Interspeech), 2022
Dongchao Yang
Songxiang Liu
Jianwei Yu
Helin Wang
Chao Weng
Yuexian Zou
DiffM, VLM
161
22
0
04 Nov 2022
An optimal control perspective on diffusion-based generative modeling
Julius Berner
Lorenz Richter
Karen Ullrich
DiffM
407
125
0
02 Nov 2022
eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers
Yogesh Balaji
Seungjun Nah
Xun Huang
Arash Vahdat
Jiaming Song
...
Timo Aila
S. Laine
Bryan Catanzaro
Tero Karras
Xuan Li
VLM, MoE
517
974
0
02 Nov 2022
DSPGAN: a GAN-based universal vocoder for high-fidelity TTS by time-frequency domain supervision from DSP
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Kun Song
Yongmao Zhang
Yinjiao Lei
Jian Cong
Hanzhao Li
Linfu Xie
Gang He
Jinfeng Bai
150
23
0
02 Nov 2022
Concrete Score Matching: Generalized Score Matching for Discrete Data
Neural Information Processing Systems (NeurIPS), 2022
Chenlin Meng
Kristy Choi
Jiaming Song
Stefano Ermon
DiffM
503
105
0
02 Nov 2022
SSD-LM: Semi-autoregressive Simplex-based Diffusion Language Model for Text Generation and Modular Control
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Xiaochuang Han
Sachin Kumar
Yulia Tsvetkov
290
135
0
31 Oct 2022
Guided Conditional Diffusion for Controllable Traffic Simulation
IEEE International Conference on Robotics and Automation (ICRA), 2022
Ziyuan Zhong
Davis Rempe
Danfei Xu
Yuxiao Chen
Sushant Veer
Tong Che
Baishakhi Ray
Marco Pavone
263
214
0
31 Oct 2022
Robust MelGAN: A robust universal neural vocoder for high-fidelity TTS
International Symposium on Chinese Spoken Language Processing (ISCSLP), 2022
Kun Song
Jian Cong
Xinsheng Wang
Yongmao Zhang
Linfu Xie
Ning Jiang
Haiying Wu
128
0
0
31 Oct 2022
Diffusion-based Generative Speech Source Separation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Robin Scheibler
Youna Ji
Soo-Whan Chung
J. Byun
Soyeon Choe
Min-Seok Choi
DiffM
342
60
0
31 Oct 2022
SRTNet: Time Domain Speech Enhancement Via Stochastic Refinement
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Zhibin Qiu
Mengfan Fu
Yinfeng Yu
Lili Yin
Gang Hua
Hao-Ming Huang
DiffM
223
22
0
30 Oct 2022
Conditioning and Sampling in Variational Diffusion Models for Speech Super-Resolution
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Chin-Yun Yu
Sung-Lin Yeh
Gyorgy Fazekas
Hao Tang
DiffM
131
31
0
27 Oct 2022
Source-Filter HiFi-GAN: Fast and Pitch Controllable High-Fidelity Neural Vocoder
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Reo Yoneyama
Yi-Chiao Wu
Tomoki Toda
213
35
0
27 Oct 2022
Solving Audio Inverse Problems with a Diffusion Model
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Eloi Moliner
J. Lehtinen
Vesa Valimaki
DiffM
323
73
0
27 Oct 2022
Full-band General Audio Synthesis with Score-based Diffusion
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Santiago Pascual
Gautam Bhattacharya
Chunghsin Yeh
Jordi Pons
Joan Serrà
DiffM
193
39
0
26 Oct 2022
Structure-based Drug Design with Equivariant Diffusion Models
Nature Computational Science (Nat. Comput. Sci.), 2022
Arne Schneuing
Yuanqi Du
Charles Harris
Arian R. Jamasb
Ilia Igashov
...
Pietro Lio
Daniel Schwalbe-Koda
Max Welling
Michael M. Bronstein
B. Correia
DiffM
332
331
0
24 Oct 2022
Deep Equilibrium Approaches to Diffusion Models
Neural Information Processing Systems (NeurIPS), 2022
Ashwini Pokle
Zhengyang Geng
Zico Kolter
DiffM
254
48
0
23 Oct 2022
Diffusion Motion: Generate Text-Guided 3D Human Motion by Diffusion Model
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Zhiyuan Ren
Zhihong Pan
Xingfa Zhou
Le Kang
VGen, DiffM
268
51
0
22 Oct 2022
Score-based Denoising Diffusion with Non-Isotropic Gaussian Noise Models
Vikram S. Voleti
Christopher Pal
Adam M. Oberman
DiffM
224
23
0
21 Oct 2022
Boomerang: Local sampling on image manifolds using diffusion models
Lorenzo Luzi
P. Mayer
Josue Casco-Rodriguez
Ali Siahkoohi
Richard G. Baraniuk
DiffM
322
21
0
21 Oct 2022