2410.11439
Cited By
A Simple Approach to Unifying Diffusion-based Conditional Generation
International Conference on Learning Representations (ICLR), 2024
15 October 2024
Ding Wang
Charles Herrmann
Kelvin C.K. Chan
Yinxiao Li
Deqing Sun
Chao Ma
Ming-Hsuan Yang
DiffM
VLM
Papers citing
"A Simple Approach to Unifying Diffusion-based Conditional Generation"
50 / 53 papers shown
CtrlVDiff: Controllable Video Generation via Unified Multimodal Video Diffusion
Dianbing Xi
Jiepeng Wang
Yuanzhi Liang
Xi Qiu
Jialun Liu
...
Yuchi Huo
Rui Wang
H. Huang
Chi Zhang
Xuelong Li
DiffM
VGen
282
3
0
26 Nov 2025
More Than Generation: Unifying Generation and Depth Estimation via Text-to-Image Diffusion Models
Hongkai Lin
Dingkang Liang
Mingyang Du
Xin Zhou
X. Bai
MoMe
MDE
VLM
592
1
0
27 Oct 2025
Unlocking the Potential of Diffusion Priors in Blind Face Restoration
Yunqi Miao
Zhiyu Qu
Mingqi Gao
Changrui Chen
Jifei Song
Jungong Han
Jiankang Deng
DiffM
154
1
0
12 Aug 2025
JointDiT: Enhancing RGB-Depth Joint Modeling with Diffusion Transformers
Kwon Byung-Ki
Jingdong Sun
Lee Hyoseok
Chong Luo
Tae-Hyun Oh
836
4
0
01 May 2025
JeDi: Joint-Image Diffusion Models for Finetuning-Free Personalized Text-to-Image Generation
Yu Zeng
Vishal M. Patel
Haochen Wang
Xun Huang
Ting-Chun Wang
Xuan Li
Yogesh Balaji
DiffM
206
57
0
08 Jul 2024
Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion
Boyuan Chen
Diego Marti Monso
Yilun Du
Max Simchowitz
Russ Tedrake
Vincent Sitzmann
DiffM
552
417
0
01 Jul 2024
Depth Anything V2
Lihe Yang
Bingyi Kang
Zilong Huang
Zhen Zhao
Xiaohan Li
Jiashi Feng
Hengshuang Zhao
DiffM
VLM
MDE
560
1,481
0
13 Jun 2024
GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image
European Conference on Computer Vision (ECCV), 2024
Xiao Fu
Wei Yin
Mu Hu
Kaixuan Wang
Yuexin Ma
Ping Tan
Shaojie Shen
Dahua Lin
Xiaoxiao Long
DiffM
418
273
0
18 Mar 2024
Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers
Tsai-Shien Chen
Aliaksandr Siarohin
Willi Menapace
Ekaterina Deyneka
Hsiang-wei Chao
...
Yuwei Fang
Hsin-Ying Lee
Jian Ren
Ming-Hsuan Yang
Sergey Tulyakov
VGen
492
387
0
29 Feb 2024
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data
Lihe Yang
Bingyi Kang
Zilong Huang
Xiaohan Li
Jiashi Feng
Hengshuang Zhao
VLM
892
1,689
0
19 Jan 2024
Zero-Shot Metric Depth with a Field-of-View Conditioned Diffusion Model
Saurabh Saxena
Junhwa Hur
Charles Herrmann
Deqing Sun
David J. Fleet
DiffM
301
36
0
20 Dec 2023
LooseControl: Lifting ControlNet for Generalized Depth Conditioning
International Conference on Computer Graphics and Interactive Techniques (SIGGRAPH), 2023
Shariq Farooq Bhat
Niloy J. Mitra
Peter Wonka
AI4CE
DiffM
288
73
0
05 Dec 2023
Readout Guidance: Learning Control from Diffusion Features
Computer Vision and Pattern Recognition (CVPR), 2023
Grace Luo
Trevor Darrell
Oliver Wang
Dan B. Goldman
Aleksander Holynski
478
50
0
04 Dec 2023
Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation
Computer Vision and Pattern Recognition (CVPR), 2023
Bingxin Ke
Anton Obukhov
Shengyu Huang
Nando Metzger
Rodrigo Caye Daudt
Konrad Schindler
VLM
MDE
559
370
0
04 Dec 2023
UniGS: Unified Representation for Image Generation and Segmentation
Computer Vision and Pattern Recognition (CVPR), 2023
Lu Qi
Lehan Yang
Weidong Guo
Yu-Syuan Xu
Bo Du
Varun Jampani
Ming-Hsuan Yang
309
28
0
04 Dec 2023
HyperHuman: Hyper-Realistic Human Generation with Latent Structural Diffusion
International Conference on Learning Representations (ICLR), 2023
Xian Liu
Jian Ren
Aliaksandr Siarohin
Ivan Skorokhodov
Yanyu Li
Dahua Lin
Xihui Liu
Ziwei Liu
Sergey Tulyakov
412
79
0
12 Oct 2023
JointNet: Extending Text-to-Image Diffusion for Dense Distribution Modeling
International Conference on Learning Representations (ICLR), 2023
Jingyang Zhang
Shiwei Li
Yuanxun Lu
Tian Fang
David McKinnon
Yanghai Tsin
Long Quan
Yao Yao
267
17
0
10 Oct 2023
IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models
Hu Ye
Jun Zhang
Siyi Liu
Xiao Han
Wei Yang
DiffM
449
1,487
0
13 Aug 2023
AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning
International Conference on Learning Representations (ICLR), 2023
Yuwei Guo
Ceyuan Yang
Anyi Rao
Zhengyang Liang
Yaohui Wang
Yu Qiao
Maneesh Agrawala
Dahua Lin
Bo Dai
VGen
1.1K
1,456
0
10 Jul 2023
The Surprising Effectiveness of Diffusion Models for Optical Flow and Monocular Depth Estimation
Neural Information Processing Systems (NeurIPS), 2023
Saurabh Saxena
Charles Herrmann
Junhwa Hur
Abhishek Kar
Mohammad Norouzi
Deqing Sun
David J. Fleet
DiffM
431
135
0
02 Jun 2023
StyleDrop: Text-to-Image Generation in Any Style
Kihyuk Sohn
Nataniel Ruiz
Kimin Lee
Daniel Castro Chin
Irina Blok
...
Yuanzhen Li
Yuan Hao
Irfan Essa
Michael Rubinstein
Dilip Krishnan
325
225
0
01 Jun 2023
Diffusion Model for Dense Matching
International Conference on Learning Representations (ICLR), 2023
Jisu Nam
Gyuseong Lee
Sunwoo Kim
Ines Hyeonsu Kim
Hyoungwon Cho
Seyeong Kim
Seung Wook Kim
DiffM
328
24
0
30 May 2023
Uni-ControlNet: All-in-One Control to Text-to-Image Diffusion Models
Neural Information Processing Systems (NeurIPS), 2023
Shihao Zhao
Dongdong Chen
Yen-Chun Chen
Jianmin Bao
Shaozhe Hao
Lu Yuan
Kwan-Yee K. Wong
493
435
0
25 May 2023
LDM3D: Latent Diffusion Model for 3D
Gabriela Ben-Melech Stan
Diana Wofk
Scottie Fox
Alex Redden
Will Saxton
...
Estelle Aflalo
Shao-Yen Tseng
Fabio Nonato
Matthias Muller
Vasudev Lal
384
64
0
18 May 2023
Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models
Computer Vision and Pattern Recognition (CVPR), 2023
Jiarui Xu
Sifei Liu
Arash Vahdat
Wonmin Byeon
Xiaolong Wang
Shalini De Mello
VLM
977
432
0
08 Mar 2023
Unleashing Text-to-Image Diffusion Models for Visual Perception
IEEE International Conference on Computer Vision (ICCV), 2023
Wenliang Zhao
Yongming Rao
Zuyan Liu
Benlin Liu
Jie Zhou
Jiwen Lu
ObjD
VLM
MDE
1.1K
322
0
03 Mar 2023
ZoeDepth: Zero-shot Transfer by Combining Relative and Metric Depth
S. Bhat
R. Birkl
Diana Wofk
Peter Wonka
Matthias Müller
VLM
MDE
696
868
0
23 Feb 2023
Composer: Creative and Controllable Image Synthesis with Composable Conditions
International Conference on Machine Learning (ICML), 2023
Lianghua Huang
Di Chen
Yu Liu
Yujun Shen
Deli Zhao
Jingren Zhou
DiffM
543
371
0
20 Feb 2023
T2I-Adapter: Learning Adapters to Dig out More Controllable Ability for Text-to-Image Diffusion Models
AAAI Conference on Artificial Intelligence (AAAI), 2023
Chong Mou
Xintao Wang
Liangbin Xie
Yanze Wu
Shuai Liu
Chen Ma
Ying Shan
Xiaohu Qie
DiffM
656
1,603
0
16 Feb 2023
Adding Conditional Control to Text-to-Image Diffusion Models
IEEE International Conference on Computer Vision (ICCV), 2023
Lvmin Zhang
Anyi Rao
Maneesh Agrawala
AI4CE
1.2K
6,666
1
10 Feb 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
International Conference on Machine Learning (ICML), 2023
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
1.6K
7,784
0
30 Jan 2023
DAG: Depth-Aware Guidance with Denoising Diffusion Probabilistic Models
Gyeongnyeon Kim
Wooseok Jang
Gyuseong Lee
Susung Hong
Junyoung Seo
Seung Wook Kim
VLM
DiffM
344
13
0
17 Dec 2022
Plug-and-Play Diffusion Features for Text-Driven Image-to-Image Translation
Computer Vision and Pattern Recognition (CVPR), 2022
Narek Tumanyan
Michal Geyer
Shai Bagon
Tali Dekel
493
1,002
0
22 Nov 2022
DiffEdit: Diffusion-based semantic image editing with mask guidance
International Conference on Learning Representations (ICLR), 2022
Guillaume Couairon
Jakob Verbeek
Holger Schwenk
Matthieu Cord
DiffM
606
725
0
20 Oct 2022
Imagen Video: High Definition Video Generation with Diffusion Models
Jonathan Ho
William Chan
Chitwan Saharia
Jay Whang
Ruiqi Gao
...
Diederik P. Kingma
Ben Poole
Mohammad Norouzi
David J. Fleet
Tim Salimans
VGen
579
1,987
0
05 Oct 2022
Classifier-Free Diffusion Guidance
Jonathan Ho
Tim Salimans
FaML
710
5,964
0
26 Jul 2022
Elucidating the Design Space of Diffusion-Based Generative Models
Neural Information Processing Systems (NeurIPS), 2022
Tero Karras
M. Aittala
Timo Aila
S. Laine
DiffM
1.1K
3,189
0
01 Jun 2022
Hierarchical Text-Conditional Image Generation with CLIP Latents
Aditya A. Ramesh
Prafulla Dhariwal
Alex Nichol
Casey Chu
Mark Chen
VLM
DiffM
1.5K
8,816
0
13 Apr 2022
Video Diffusion Models
Neural Information Processing Systems (NeurIPS), 2022
Jonathan Ho
Tim Salimans
Alexey A. Gritsenko
William Chan
Mohammad Norouzi
David J. Fleet
DiffM
VGen
1.1K
2,472
0
07 Apr 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
International Conference on Machine Learning (ICML), 2022
Junnan Li
Dongxu Li
Caiming Xiong
Guosheng Lin
MLLM
BDL
VLM
CLIP
1.5K
6,390
0
28 Jan 2022
High-Resolution Image Synthesis with Latent Diffusion Models
Computer Vision and Pattern Recognition (CVPR), 2021
Robin Rombach
A. Blattmann
Dominik Lorenz
Patrick Esser
Bjorn Ommer
DiffM
4.8K
23,580
0
20 Dec 2021
SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations
Chenlin Meng
Yutong He
Yang Song
Jiaming Song
Jiajun Wu
Jun-Yan Zhu
Stefano Ermon
DiffM
866
2,070
0
02 Aug 2021
Vision Transformers for Dense Prediction
IEEE International Conference on Computer Vision (ICCV), 2021
René Ranftl
Alexey Bochkovskiy
V. Koltun
ViT
MDE
625
2,617
0
24 Mar 2021
Score-Based Generative Modeling through Stochastic Differential Equations
International Conference on Learning Representations (ICLR), 2020
Yang Song
Jascha Narain Sohl-Dickstein
Diederik P. Kingma
Abhishek Kumar
Stefano Ermon
Ben Poole
DiffM
SyDa
3.7K
10,034
0
26 Nov 2020
Denoising Diffusion Implicit Models
International Conference on Learning Representations (ICLR), 2020
Jiaming Song
Chenlin Meng
Stefano Ermon
VLM
DiffM
1.9K
11,480
0
06 Oct 2020
Denoising Diffusion Probabilistic Models
Jonathan Ho
Ajay Jain
Pieter Abbeel
DiffM
6.2K
29,328
0
19 Jun 2020
Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2019
René Ranftl
Katrin Lasinger
David Hafner
Konrad Schindler
V. Koltun
MDE
879
2,439
0
02 Jul 2019
OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields
Zhe Cao
Gines Hidalgo
Shunsuke Saito
S. Wei
Yaser Sheikh
3DH
CVBM
1.7K
5,457
0
18 Dec 2018
ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes
Computer Vision and Pattern Recognition (CVPR), 2017
Angela Dai
Angel X. Chang
Manolis Savva
Maciej Halber
Thomas Funkhouser
Matthias Nießner
3DPC
3DV
1.6K
5,325
0
14 Feb 2017
U-Net: Convolutional Networks for Biomedical Image Segmentation
Olaf Ronneberger
Philipp Fischer
Thomas Brox
SSeg
3DV
3.9K
93,751
0
18 May 2015