A Simple Approach to Unifying Diffusion-based Conditional Generation

International Conference on Learning Representations (ICLR), 2024
15 October 2024
Ding Wang
Charles Herrmann
Kelvin C.K. Chan
Yinxiao Li
Deqing Sun
Chao Ma
Ming-Hsuan Yang
DiffM, VLM

Papers citing "A Simple Approach to Unifying Diffusion-based Conditional Generation"

50 / 53 papers shown
CtrlVDiff: Controllable Video Generation via Unified Multimodal Video Diffusion
Dianbing Xi
Jiepeng Wang
Yuanzhi Liang
Xi Qiu
Jialun Liu
...
Yuchi Huo
Rui Wang
H. Huang
Chi Zhang
Xuelong Li
DiffM, VGen
282
3
0
26 Nov 2025
More Than Generation: Unifying Generation and Depth Estimation via Text-to-Image Diffusion Models
Hongkai Lin
Dingkang Liang
Mingyang Du
Xin Zhou
X. Bai
MoMe, MDE, VLM
592
1
0
27 Oct 2025
Unlocking the Potential of Diffusion Priors in Blind Face Restoration
Yunqi Miao
Zhiyu Qu
Mingqi Gao
Changrui Chen
Jifei Song
Jungong Han
Jiankang Deng
DiffM
154
1
0
12 Aug 2025
JointDiT: Enhancing RGB-Depth Joint Modeling with Diffusion Transformers
Kwon Byung-Ki
Jingdong Sun
Lee Hyoseok
Chong Luo
Tae-Hyun Oh
836
4
0
01 May 2025
JeDi: Joint-Image Diffusion Models for Finetuning-Free Personalized Text-to-Image Generation
Yu Zeng
Vishal M. Patel
Haochen Wang
Xun Huang
Ting-Chun Wang
Xuan Li
Yogesh Balaji
DiffM
206
57
0
08 Jul 2024
Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion
Boyuan Chen
Diego Marti Monso
Yilun Du
Max Simchowitz
Russ Tedrake
Vincent Sitzmann
DiffM
552
417
0
01 Jul 2024
Depth Anything V2
Lihe Yang
Bingyi Kang
Zilong Huang
Zhen Zhao
Xiaohan Li
Jiashi Feng
Hengshuang Zhao
DiffM, VLM, MDE
560
1,481
0
13 Jun 2024
GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image
European Conference on Computer Vision (ECCV), 2024
Xiao Fu
Wei Yin
Mu Hu
Kaixuan Wang
Yuexin Ma
Ping Tan
Shaojie Shen
Dahua Lin
Xiaoxiao Long
DiffM
418
273
0
18 Mar 2024
Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers
Tsai-Shien Chen
Aliaksandr Siarohin
Willi Menapace
Ekaterina Deyneka
Hsiang-wei Chao
...
Yuwei Fang
Hsin-Ying Lee
Jian Ren
Ming-Hsuan Yang
Sergey Tulyakov
VGen
492
387
0
29 Feb 2024
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data
Lihe Yang
Bingyi Kang
Zilong Huang
Xiaohan Li
Jiashi Feng
Hengshuang Zhao
VLM
892
1,689
0
19 Jan 2024
Zero-Shot Metric Depth with a Field-of-View Conditioned Diffusion Model
Saurabh Saxena
Junhwa Hur
Charles Herrmann
Deqing Sun
David J. Fleet
DiffM
301
36
0
20 Dec 2023
LooseControl: Lifting ControlNet for Generalized Depth Conditioning
International Conference on Computer Graphics and Interactive Techniques (SIGGRAPH), 2023
Shariq Farooq Bhat
Niloy J. Mitra
Peter Wonka
AI4CE, DiffM
288
73
0
05 Dec 2023
Readout Guidance: Learning Control from Diffusion Features
Computer Vision and Pattern Recognition (CVPR), 2023
Grace Luo
Trevor Darrell
Oliver Wang
Dan B. Goldman
Aleksander Holynski
478
50
0
04 Dec 2023
Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation
Computer Vision and Pattern Recognition (CVPR), 2023
Bingxin Ke
Anton Obukhov
Shengyu Huang
Nando Metzger
Rodrigo Caye Daudt
Konrad Schindler
VLM, MDE
559
370
0
04 Dec 2023
UniGS: Unified Representation for Image Generation and Segmentation
Computer Vision and Pattern Recognition (CVPR), 2023
Lu Qi
Lehan Yang
Weidong Guo
Yu-Syuan Xu
Bo Du
Varun Jampani
Ming-Hsuan Yang
309
28
0
04 Dec 2023
HyperHuman: Hyper-Realistic Human Generation with Latent Structural Diffusion
International Conference on Learning Representations (ICLR), 2023
Xian Liu
Jian Ren
Aliaksandr Siarohin
Ivan Skorokhodov
Yanyu Li
Dahua Lin
Xihui Liu
Ziwei Liu
Sergey Tulyakov
412
79
0
12 Oct 2023
JointNet: Extending Text-to-Image Diffusion for Dense Distribution Modeling
International Conference on Learning Representations (ICLR), 2023
Jingyang Zhang
Shiwei Li
Yuanxun Lu
Tian Fang
David McKinnon
Yanghai Tsin
Long Quan
Yao Yao
267
17
0
10 Oct 2023
IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models
Hu Ye
Jun Zhang
Siyi Liu
Xiao Han
Wei Yang
DiffM
449
1,487
0
13 Aug 2023
AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning
International Conference on Learning Representations (ICLR), 2023
Yuwei Guo
Ceyuan Yang
Anyi Rao
Zhengyang Liang
Yaohui Wang
Yu Qiao
Maneesh Agrawala
Dahua Lin
Bo Dai
VGen
1.1K
1,456
0
10 Jul 2023
The Surprising Effectiveness of Diffusion Models for Optical Flow and Monocular Depth Estimation
Neural Information Processing Systems (NeurIPS), 2023
Saurabh Saxena
Charles Herrmann
Junhwa Hur
Abhishek Kar
Mohammad Norouzi
Deqing Sun
David J. Fleet
DiffM
431
135
0
02 Jun 2023
StyleDrop: Text-to-Image Generation in Any Style
Kihyuk Sohn
Nataniel Ruiz
Kimin Lee
Daniel Castro Chin
Irina Blok
...
Yuanzhen Li
Yuan Hao
Irfan Essa
Michael Rubinstein
Dilip Krishnan
325
225
0
01 Jun 2023
Diffusion Model for Dense Matching
International Conference on Learning Representations (ICLR), 2023
Jisu Nam
Gyuseong Lee
Sunwoo Kim
Ines Hyeonsu Kim
Hyoungwon Cho
Seyeong Kim
Seung Wook Kim
DiffM
328
24
0
30 May 2023
Uni-ControlNet: All-in-One Control to Text-to-Image Diffusion Models
Neural Information Processing Systems (NeurIPS), 2023
Shihao Zhao
Dongdong Chen
Yen-Chun Chen
Jianmin Bao
Shaozhe Hao
Lu Yuan
Kwan-Yee K. Wong
493
435
0
25 May 2023
LDM3D: Latent Diffusion Model for 3D
Gabriela Ben-Melech Stan
Diana Wofk
Scottie Fox
Alex Redden
Will Saxton
...
Estelle Aflalo
Shao-Yen Tseng
Fabio Nonato
Matthias Muller
Vasudev Lal
384
64
0
18 May 2023
Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models
Computer Vision and Pattern Recognition (CVPR), 2023
Jiarui Xu
Sifei Liu
Arash Vahdat
Wonmin Byeon
Xiaolong Wang
Shalini De Mello
VLM
977
432
0
08 Mar 2023
Unleashing Text-to-Image Diffusion Models for Visual Perception
IEEE International Conference on Computer Vision (ICCV), 2023
Wenliang Zhao
Yongming Rao
Zuyan Liu
Benlin Liu
Jie Zhou
Jiwen Lu
ObjD, VLM, MDE
1.1K
322
0
03 Mar 2023
ZoeDepth: Zero-shot Transfer by Combining Relative and Metric Depth
S. Bhat
R. Birkl
Diana Wofk
Peter Wonka
Matthias Müller
VLM, MDE
696
868
0
23 Feb 2023
Composer: Creative and Controllable Image Synthesis with Composable Conditions
International Conference on Machine Learning (ICML), 2023
Lianghua Huang
Di Chen
Yu Liu
Yujun Shen
Deli Zhao
Jingren Zhou
DiffM
543
371
0
20 Feb 2023
T2I-Adapter: Learning Adapters to Dig out More Controllable Ability for Text-to-Image Diffusion Models
AAAI Conference on Artificial Intelligence (AAAI), 2023
Chong Mou
Xintao Wang
Liangbin Xie
Yanze Wu
Shuai Liu
Chen Ma
Ying Shan
Xiaohu Qie
DiffM
656
1,603
0
16 Feb 2023
Adding Conditional Control to Text-to-Image Diffusion Models
IEEE International Conference on Computer Vision (ICCV), 2023
Lvmin Zhang
Anyi Rao
Maneesh Agrawala
AI4CE
1.2K
6,666
1
10 Feb 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
International Conference on Machine Learning (ICML), 2023
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM, MLLM
1.6K
7,784
0
30 Jan 2023
DAG: Depth-Aware Guidance with Denoising Diffusion Probabilistic Models
Gyeongnyeon Kim
Wooseok Jang
Gyuseong Lee
Susung Hong
Junyoung Seo
Seung Wook Kim
VLM, DiffM
344
13
0
17 Dec 2022
Plug-and-Play Diffusion Features for Text-Driven Image-to-Image Translation
Computer Vision and Pattern Recognition (CVPR), 2022
Narek Tumanyan
Michal Geyer
Shai Bagon
Tali Dekel
493
1,002
0
22 Nov 2022
DiffEdit: Diffusion-based semantic image editing with mask guidance
International Conference on Learning Representations (ICLR), 2022
Guillaume Couairon
Jakob Verbeek
Holger Schwenk
Matthieu Cord
DiffM
606
725
0
20 Oct 2022
Imagen Video: High Definition Video Generation with Diffusion Models
Jonathan Ho
William Chan
Chitwan Saharia
Jay Whang
Ruiqi Gao
...
Diederik P. Kingma
Ben Poole
Mohammad Norouzi
David J. Fleet
Tim Salimans
VGen
579
1,987
0
05 Oct 2022
Classifier-Free Diffusion Guidance
Jonathan Ho
Tim Salimans
FaML
710
5,964
0
26 Jul 2022
Elucidating the Design Space of Diffusion-Based Generative Models
Neural Information Processing Systems (NeurIPS), 2022
Tero Karras
M. Aittala
Timo Aila
S. Laine
DiffM
1.1K
3,189
0
01 Jun 2022
Hierarchical Text-Conditional Image Generation with CLIP Latents
Aditya A. Ramesh
Prafulla Dhariwal
Alex Nichol
Casey Chu
Mark Chen
VLM, DiffM
1.5K
8,816
0
13 Apr 2022
Video Diffusion Models
Neural Information Processing Systems (NeurIPS), 2022
Jonathan Ho
Tim Salimans
Alexey A. Gritsenko
William Chan
Mohammad Norouzi
David J. Fleet
DiffM, VGen
1.1K
2,472
0
07 Apr 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
International Conference on Machine Learning (ICML), 2022
Junnan Li
Dongxu Li
Caiming Xiong
Guosheng Lin
MLLM, BDL, VLM, CLIP
1.5K
6,390
0
28 Jan 2022
High-Resolution Image Synthesis with Latent Diffusion Models
Computer Vision and Pattern Recognition (CVPR), 2021
Robin Rombach
A. Blattmann
Dominik Lorenz
Patrick Esser
Bjorn Ommer
DiffM
4.8K
23,580
0
20 Dec 2021
SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations
Chenlin Meng
Yutong He
Yang Song
Jiaming Song
Jiajun Wu
Jun-Yan Zhu
Stefano Ermon
DiffM
866
2,070
0
02 Aug 2021
Vision Transformers for Dense Prediction
IEEE International Conference on Computer Vision (ICCV), 2021
René Ranftl
Alexey Bochkovskiy
V. Koltun
ViT, MDE
625
2,617
0
24 Mar 2021
Score-Based Generative Modeling through Stochastic Differential Equations
International Conference on Learning Representations (ICLR), 2020
Yang Song
Jascha Narain Sohl-Dickstein
Diederik P. Kingma
Abhishek Kumar
Stefano Ermon
Ben Poole
DiffM, SyDa
3.7K
10,034
0
26 Nov 2020
Denoising Diffusion Implicit Models
International Conference on Learning Representations (ICLR), 2020
Jiaming Song
Chenlin Meng
Stefano Ermon
VLM, DiffM
1.9K
11,480
0
06 Oct 2020
Denoising Diffusion Probabilistic Models
Jonathan Ho
Ajay Jain
Pieter Abbeel
DiffM
6.2K
29,328
0
19 Jun 2020
Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2019
René Ranftl
Katrin Lasinger
David Hafner
Konrad Schindler
V. Koltun
MDE
879
2,439
0
02 Jul 2019
OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields
Zhe Cao
Gines Hidalgo
Shunsuke Saito
S. Wei
Yaser Sheikh
3DH, CVBM
1.7K
5,457
0
18 Dec 2018
ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes
Computer Vision and Pattern Recognition (CVPR), 2017
Angela Dai
Angel X. Chang
Manolis Savva
Maciej Halber
Thomas Funkhouser
Matthias Nießner
3DPC, 3DV
1.6K
5,325
0
14 Feb 2017
U-Net: Convolutional Networks for Biomedical Image Segmentation
Olaf Ronneberger
Philipp Fischer
Thomas Brox
SSeg, 3DV
3.9K
93,751
0
18 May 2015