Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2307.01952
Cited By
SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis
4 July 2023
Dustin Podell
Zion English
Kyle Lacey
A. Blattmann
Tim Dockhorn
Jonas Muller
Joe Penna
Robin Rombach
Re-assign community
ArXiv
PDF
HTML
Papers citing
"SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis"
50 / 1,613 papers shown
Title
LightLab: Controlling Light Sources in Images with Diffusion Models
Nadav Magar
Amir Hertz
Eric Tabellion
Yael Pritch
Alex Rav Acha
Ariel Shamir
Yedid Hoshen
7
0
0
14 May 2025
Visually Guided Decoding: Gradient-Free Hard Prompt Inversion with Language Models
Donghoon Kim
Minji Bae
Kyuhong Shim
B. Shim
26
0
0
13 May 2025
ShotAdapter: Text-to-Multi-Shot Video Generation with Diffusion Models
Ozgur Kara
Krishna Kumar Singh
Feng Liu
Duygu Ceylan
James M. Rehg
Tobias Hinz
DiffM
VGen
23
0
0
12 May 2025
DanceGRPO: Unleashing GRPO on Visual Generation
Zeyue Xue
Jie Wu
Yu Gao
Fangyuan Kong
Lingting Zhu
...
Zhiheng Liu
Wei Liu
Qiushan Guo
Weilin Huang
Ping Luo
EGVM
VGen
47
0
0
12 May 2025
Uni-AIMS: AI-Powered Microscopy Image Analysis
Yanhui Hong
Nan Wang
Zhiyi Xia
Haoyi Tao
Xi Fang
...
Shengyu Li
Ziqi Chen
Zezhong Zhang
Guolin Ke
Linfeng Zhang
21
0
0
11 May 2025
HDGlyph: A Hierarchical Disentangled Glyph-Based Framework for Long-Tail Text Rendering in Diffusion Models
Shuhan Zhuang
Mengqi Huang
Fengyi Fu
Nan Chen
Bohan Lei
Zhendong Mao
DiffM
20
0
0
10 May 2025
The ML.ENERGY Benchmark: Toward Automated Inference Energy Measurement and Optimization
Jae-Won Chung
Jiachen Liu
Jeff J. Ma
Ruofan Wu
Oh Jun Kweon
Yuxuan Xia
Zhiyu Wu
Mosharaf Chowdhury
21
0
0
09 May 2025
InstanceGen: Image Generation with Instance-level Instructions
Etai Sella
Yanir Kleiman
Hadar Averbuch-Elor
26
0
0
08 May 2025
ViCTr: Vital Consistency Transfer for Pathology Aware Image Synthesis
Onkar Susladkar
Gayatri S Deshmukh
Yalcin Tur
Ulas Bagci
MedIm
51
0
0
08 May 2025
Flow-GRPO: Training Flow Matching Models via Online RL
Jie Liu
Gongye Liu
Jiajun Liang
Y. Li
Jiaheng Liu
X. Wang
Pengfei Wan
Di Zhang
Wanli Ouyang
AI4CE
68
0
0
08 May 2025
Boosting Statistic Learning with Synthetic Data from Pretrained Large Models
Jialong Jiang
Wenkang Hu
Jian Huang
Yuling Jiao
Xu Liu
DiffM
45
0
0
08 May 2025
TokLIP: Marry Visual Tokens to CLIP for Multimodal Comprehension and Generation
Haokun Lin
Teng Wang
Yixiao Ge
Yuying Ge
Zhichao Lu
Ying Wei
Qingfu Zhang
Zhenan Sun
Ying Shan
MLLM
VLM
64
0
0
08 May 2025
Text2CT: Towards 3D CT Volume Generation from Free-text Descriptions Using Diffusion Model
Pengfei Guo
Can Zhao
Dong Yang
Yufan He
V. Nath
...
Zongwei Zhou
Benjamin D. Simon
Stephanie Harmon
B. Turkbey
Daguang Xu
DiffM
MedIm
38
0
0
07 May 2025
Multi-turn Consistent Image Editing
Zijun Zhou
Yingying Deng
Xiangyu He
Weiming Dong
Fan Tang
46
0
0
07 May 2025
Replay to Remember (R2R): An Efficient Uncertainty-driven Unsupervised Continual Learning Framework Using Generative Replay
Sriram Mandalika
Harsha Vardhan
Athira Nambiar
VLM
58
0
0
07 May 2025
CountDiffusion: Text-to-Image Synthesis with Training-Free Counting-Guidance Diffusion
Y. Li
Pencheng Wan
Liang Han
Yaowei Wang
Liqiang Nie
Min Zhang
41
0
0
07 May 2025
PrimitiveAnything: Human-Crafted 3D Primitive Assembly Generation with Auto-Regressive Transformer
Jingwen Ye
Yuze He
Yanning Zhou
Yiqin Zhu
Kaiwen Xiao
Yong-Jin Liu
Wei Yang
Xiao Han
39
0
0
07 May 2025
Generating Synthetic Data via Augmentations for Improved Facial Resemblance in DreamBooth and InstantID
Koray Ulusan
Benjamin Kiefer
DiffM
27
0
0
06 May 2025
Multimodal Benchmarking and Recommendation of Text-to-Image Generation Models
Kapil Wanaskar
Gaytri Jena
Magdalini Eirinaki
EGVM
29
0
0
06 May 2025
SynSHRP2: A Synthetic Multimodal Benchmark for Driving Safety-critical Events Derived from Real-world Driving Data
Liang Shi
Boyu Jiang
Zhenyuan Yuan
Miguel A. Perez
Feng Guo
24
0
0
06 May 2025
Ming-Lite-Uni: Advancements in Unified Architecture for Natural Multimodal Interaction
Biao Gong
Cheng Zou
Dandan Zheng
Hu Yu
Jingdong Chen
...
Qingpei Guo
Rui Liu
Weilong Chai
Xinyu Xiao
Ziyuan Huang
MLLM
74
1
0
05 May 2025
SuperEdit: Rectifying and Facilitating Supervision for Instruction-Based Image Editing
Ming Li
Xin Gu
Fan Chen
X. Xing
Longyin Wen
C. L. P. Chen
Sijie Zhu
DiffM
79
1
0
05 May 2025
MUSAR: Exploring Multi-Subject Customization from Single-Subject Dataset via Attention Routing
Zinan Guo
Pengze Zhang
Yanze Wu
Chong Mou
Songtao Zhao
Qian He
24
0
0
05 May 2025
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities
X. Zhang
Jintao Guo
Shanshan Zhao
Minghao Fu
Lunhao Duan
Guo-Hua Wang
Qing-Guo Chen
Zhao Xu
Weihua Luo
Kaifu Zhang
DiffM
60
0
0
05 May 2025
MCCD: Multi-Agent Collaboration-based Compositional Diffusion for Complex Text-to-Image Generation
Mingcheng Li
Xiaolu Hou
Ziyang Liu
Dingkang Yang
Ziyun Qian
Jiawei Chen
Jinjie Wei
Y. Jiang
Qingyao Xu
L. Zhang
DiffM
85
0
0
05 May 2025
Using Knowledge Graphs to harvest datasets for efficient CLIP model training
Simon Ging
Sebastian Walter
Jelena Bratulić
Johannes Dienert
Hannah Bast
Thomas Brox
CLIP
20
0
0
05 May 2025
Improving Physical Object State Representation in Text-to-Image Generative Systems
Tianle Chen
Chaitanya Chakka
Deepti Ghadiyaram
27
0
0
04 May 2025
Rethinking Score Distilling Sampling for 3D Editing and Generation
Xingyu Miao
Haoran Duan
Yang Long
J. Han
39
0
0
03 May 2025
Improving Editability in Image Generation with Layer-wise Memory
Daneul Kim
Jaeah Lee
Jaesik Park
DiffM
KELM
53
0
0
02 May 2025
WorldGenBench: A World-Knowledge-Integrated Benchmark for Reasoning-Driven Text-to-Image Generation
D. Zhang
Che Jiang
Ruoshi Xu
Biaoxiang Chen
Zijian Jin
Yutian Lu
Jianguo Zhang
Liang Yong
Jiebo Luo
Shengda Luo
VLM
45
0
0
02 May 2025
InstructAttribute: Fine-grained Object Attributes editing with Instruction
Xingxi Yin
Jingfeng Zhang
Zhi Li
Y. Li
Y. Zhang
DiffM
109
0
0
01 May 2025
Unlearning Sensitive Information in Multimodal LLMs: Benchmark and Attack-Defense Evaluation
Vaidehi Patil
Yi-Lin Sung
Peter Hase
Jie Peng
Tianlong Chen
Mohit Bansal
AAML
MU
79
3
0
01 May 2025
T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT
D. Jiang
Ziyu Guo
Renrui Zhang
Zhuofan Zong
Hao Li
Le Zhuo
Shilin Yan
Pheng-Ann Heng
H. Li
LRM
57
0
0
01 May 2025
Multi-Modal Language Models as Text-to-Image Model Evaluators
Jiahui Chen
Candace Ross
Reyhane Askari Hemmat
Koustuv Sinha
Melissa Hall
M. Drozdzal
Adriana Romero-Soriano
EGVM
60
0
0
01 May 2025
Controllable Weather Synthesis and Removal with Video Diffusion Models
Chih-Hao Lin
Z. Wang
Ruofan Liang
Yuxuan Zhang
Sanja Fidler
Shenlong Wang
Zan Gojcic
DiffM
VGen
42
0
0
01 May 2025
Can We Achieve Efficient Diffusion without Self-Attention? Distilling Self-Attention into Convolutions
Ziyi Dong
Chengxing Zhou
Weijian Deng
Pengxu Wei
Xiangyang Ji
Liang Lin
MQ
43
0
0
30 Apr 2025
AGHI-QA: A Subjective-Aligned Dataset and Metric for AI-Generated Human Images
Yunhao Li
Sijing Wu
Wei Sun
Zhichao Zhang
Yucheng Zhu
Zicheng Zhang
Huiyu Duan
Xiongkuo Min
Guangtao Zhai
EGVM
81
0
0
30 Apr 2025
GarmentDiffusion: 3D Garment Sewing Pattern Generation with Multimodal Diffusion Transformers
Xinyu Li
Qi Yao
Y. Wang
DiffM
41
0
0
30 Apr 2025
Nexus-Gen: A Unified Model for Image Understanding, Generation, and Editing
Hong Zhang
Zhongjie Duan
Xingjun Wang
Yuze Zhao
Weiyi Lu
Zhipeng Di
Y. Xu
Yingda Chen
Yu Zhang
MLLM
90
1
0
30 Apr 2025
T2ID-CAS: Diffusion Model and Class Aware Sampling to Mitigate Class Imbalance in Neck Ultrasound Anatomical Landmark Detection
Manikanta Varaganti
Amulya Vankayalapati
Nour Awad
Gregory R. Dion
Laura J. Brattain
DiffM
MedIm
64
0
0
29 Apr 2025
TrueFake: A Real World Case Dataset of Last Generation Fake Images also Shared on Social Networks
S. Dell’Anna
Andrea Montibeller
Giulia Boato
54
0
0
29 Apr 2025
YoChameleon: Personalized Vision and Language Generation
Thao Nguyen
Krishna Kumar Singh
Jing Shi
Trung H. Bui
Yong Jae Lee
Yuheng Li
MLLM
82
0
0
29 Apr 2025
Image Generation Method Based on Heat Diffusion Models
Pengfei Zhang
Shouqing Jia
DiffM
VLM
42
0
0
28 Apr 2025
ShowMak3r: Compositional TV Show Reconstruction
S. Kim
Seunguk Do
Jaesik Park
VGen
36
0
0
28 Apr 2025
DiVE: Efficient Multi-View Driving Scenes Generation Based on Video Diffusion Transformer
Junpeng Jiang
Gangyi Hong
Miao Zhang
Hengtong Hu
Kun Zhan
Rui Shao
Liqiang Nie
VGen
51
0
0
28 Apr 2025
RepText: Rendering Visual Text via Replicating
H. Wang
Y. Xu
Y. Li
J. Li
Chaowei Zhang
J. Wang
Kejia Yang
Z. Chen
VLM
66
0
0
28 Apr 2025
RoboVerse: Towards a Unified Platform, Dataset and Benchmark for Scalable and Generalizable Robot Learning
Haoran Geng
Feishi Wang
Songlin Wei
Y. Li
Bangjun Wang
...
Hao Dong
Siyuan Huang
Yue Wang
Jitendra Malik
Pieter Abbeel
75
3
0
26 Apr 2025
REED-VAE: RE-Encode Decode Training for Iterative Image Editing with Diffusion Models
Gal Almog
Ariel Shamir
Ohad Fried
DiffM
53
0
0
26 Apr 2025
Text-to-Image Alignment in Denoising-Based Models through Step Selection
P. Grimal
Hervé Le Borgne
Olivier Ferret
DiffM
EGVM
48
0
0
24 Apr 2025
Token-Shuffle: Towards High-Resolution Image Generation with Autoregressive Models
Xu Ma
Peize Sun
Haoyu Ma
Hao Tang
Chih-Yao Ma
...
Matt Feiszli
Peizhao Zhang
Peter Vajda
Sam S. Tsai
Y. Fu
68
1
0
24 Apr 2025
1
2
3
4
...
31
32
33
Next