2403.04279
Cited By
Controllable Generation with Text-to-Image Diffusion Models: A Survey
7 March 2024
Pu Cao
Feng Zhou
Qing-Huang Song
Lu Yang
Papers citing
"Controllable Generation with Text-to-Image Diffusion Models: A Survey"
30 / 30 papers shown
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities
X. Zhang
Jintao Guo
Shanshan Zhao
Minghao Fu
Lunhao Duan
Guo-Hua Wang
Qing-Guo Chen
Zhao Xu
Weihua Luo
Kaifu Zhang
DiffM
11
0
0
05 May 2025
Lifting by Image -- Leveraging Image Cues for Accurate 3D Human Pose Estimation
Feng Zhou
Jianqin Yin
Peiyang Li
3DH
22
2
0
25 Dec 2023
DreamDistribution: Learning Prompt Distribution for Diverse In-distribution Generation
Brian Nlong Zhao
Yuhang Xiao
Jiashu Xu
Xinyang Jiang
Yifan Yang
Dongsheng Li
Laurent Itti
Vibhav Vineet
Yunhao Ge
VLM
74
3
0
21 Dec 2023
CCM: Adding Conditional Controls to Text-to-Image Consistency Models
Jie Xiao
Kai Zhu
Han Zhang
Zhiheng Liu
Yujun Shen
Yu Liu
Xueyang Fu
Zheng-Jun Zha
DiffM
23
7
0
12 Dec 2023
ControlNet-XS: Designing an Efficient and Effective Architecture for Controlling Text-to-Image Diffusion Models
Denis Zavadski
Johann-Friedrich Feiden
Carsten Rother
DiffM
24
10
0
11 Dec 2023
UDiffText: A Unified Framework for High-quality Text Synthesis in Arbitrary Images via Character-aware Diffusion Models
Yiming Zhao
Zhouhui Lian
39
9
0
08 Dec 2023
Customizing Motion in Text-to-Video Diffusion Models
Joanna Materzyńska
Josef Sivic
Eli Shechtman
Antonio Torralba
Richard Zhang
Bryan C. Russell
VGen
DiffM
73
9
0
07 Dec 2023
SAVE: Protagonist Diversification with Structure Agnostic Video Editing
Yeji Song
Wonsik Shin
Junsoo Lee
Jeesoo Kim
Nojun Kwak
DiffM
VGen
70
3
0
05 Dec 2023
StyleCrafter: Enhancing Stylized Text-to-Video Generation with Style Adapter
Gongye Liu
Menghan Xia
Yong Zhang
Haoxin Chen
Jinbo Xing
Xintao Wang
Yujiu Yang
Ying Shan
DiffM
VGen
91
6
0
01 Dec 2023
Layered Rendering Diffusion Model for Zero-Shot Guided Image Synthesis
Zipeng Qi
Guoxi Huang
Zebin Huang
Qin Guo
Jinwen Chen
...
Jian Wang
Gang Zhang
Lufei Liu
Errui Ding
Jingdong Wang
DiffM
34
2
0
30 Nov 2023
An Image is Worth Multiple Words: Multi-attribute Inversion for Constrained Text-to-Image Synthesis
Aishwarya Agarwal
Srikrishna Karanam
Tripti Shukla
Balaji Vasan Srinivasan
DiffM
70
8
0
20 Nov 2023
LAMP: Learn A Motion Pattern for Few-Shot-Based Video Generation
Ruiqi Wu
Liangyu Chen
Tong Yang
Chunle Guo
Chongyi Li
Xiangyu Zhang
DiffM
VGen
55
29
0
16 Oct 2023
Key-Locked Rank One Editing for Text-to-Image Personalization
Yoad Tewel
Rinon Gal
Gal Chechik
Y. Atzmon
DiffM
87
163
0
02 May 2023
In-Context Learning Unlocked for Diffusion Models
Zhendong Wang
Yifan Jiang
Yadong Lu
Yelong Shen
Pengcheng He
Weizhu Chen
Zhangyang Wang
Mingyuan Zhou
VLM
DiffM
54
47
0
01 May 2023
Continual Diffusion: Continual Customization of Text-to-Image Diffusion with C-LoRA
James Smith
Yen-Chang Hsu
Lingyu Zhang
Ting Hua
Z. Kira
Yilin Shen
Hongxia Jin
DiffM
97
62
0
12 Apr 2023
InstantBooth: Personalized Text-to-Image Generation without Test-Time Finetuning
Jing Shi
Wei Xiong
Zhe-nan Lin
H. J. Jung
DiffM
95
172
0
06 Apr 2023
MindDiffuser: Controlled Image Reconstruction from Human Brain Activity with Semantic and Structural Diffusion
Yizhuo Lu
Changde Du
Dianpeng Wang
Huiguang He
DiffM
85
29
0
24 Mar 2023
Natural scene reconstruction from fMRI signals using generative latent diffusion
Furkan Ozcelik
Rufin VanRullen
DiffM
49
42
0
09 Mar 2023
Cones: Concept Neurons in Diffusion Models for Customized Generation
Zhiheng Liu
Ruili Feng
Kai Zhu
Yifei Zhang
Kecheng Zheng
Yu Liu
Deli Zhao
Jingren Zhou
Yang Cao
DiffM
59
73
0
09 Mar 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
207
1,899
0
30 Jan 2023
Visual Prompt Tuning for Generative Transfer Learning
Kihyuk Sohn
Yuan Hao
José Lezama
Luisa F. Polanía
Huiwen Chang
Han Zhang
Irfan Essa
Lu Jiang
VPVLM
VLM
16
58
0
03 Oct 2022
Mind Reader: Reconstructing complex images from brain activities
Sikun Lin
Thomas C. Sprague
Ambuj K. Singh
DiffM
70
70
0
30 Sep 2022
Re-Imagen: Retrieval-Augmented Text-to-Image Generator
Wenhu Chen
Hexiang Hu
Chitwan Saharia
William W. Cohen
VLM
83
121
0
29 Sep 2022
Diffusion Models in Vision: A Survey
Florinel-Alin Croitoru
Vlad Hondru
Radu Tudor Ionescu
M. Shah
DiffM
VLM
MedIm
159
697
0
10 Sep 2022
Diffusion Models: A Comprehensive Survey of Methods and Applications
Ling Yang
Zhilong Zhang
Yingxia Shao
Shenda Hong
Runsheng Xu
Yue Zhao
Wentao Zhang
Bin Cui
Ming-Hsuan Yang
DiffM
MedIm
185
800
0
02 Sep 2022
WIT: Wikipedia-based Image Text Dataset for Multimodal Multilingual Machine Learning
Krishna Srinivasan
K. Raman
Jiecao Chen
Michael Bendersky
Marc Najork
VLM
160
239
0
02 Mar 2021
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
237
3,790
0
24 Feb 2021
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts
Soravit Changpinyo
P. Sharma
Nan Ding
Radu Soricut
VLM
251
845
0
17 Feb 2021
A Style-Based Generator Architecture for Generative Adversarial Networks
Tero Karras
S. Laine
Timo Aila
243
8,946
0
12 Dec 2018
Parsing R-CNN for Instance-Level Human Analysis
Lu Yang
Q. Song
Zhihui Wang
Ming Jiang
SSeg
27
106
0
30 Nov 2018