2205.11487
Cited By
Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding
Neural Information Processing Systems (NeurIPS), 2022
23 May 2022
Chitwan Saharia
William Chan
Saurabh Saxena
Lala Li
Jay Whang
Emily L. Denton
Seyed Kamyar Seyed Ghasemipour
Burcu Karagol Ayan
S. S. Mahdavi
Raphael Gontijo-Lopes
Tim Salimans
Jonathan Ho
David J Fleet
Mohammad Norouzi
VLM
Papers citing "Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding"
50 / 5,039 papers shown
Prompt-aware classifier free guidance for diffusion models
Xuanhao Zhang
Chang Li
DiffM
VLM
173
0
0
25 Sep 2025
MMG: Mutual Information Estimation via the MMSE Gap in Diffusion
Longxuan Yu
Xing Shi
Xianghao Kong
Tong Jia
Greg Ver Steeg
DiffM
217
0
0
24 Sep 2025
PhysCtrl: Generative Physics for Controllable and Physics-Grounded Video Generation
Chen Wang
Chuhao Chen
Yiming Huang
Zhiyang Dou
Yuan Liu
Jiatao Gu
Lingjie Liu
DiffM
VGen
PINN
619
9
0
24 Sep 2025
Efficient Encoder-Free Pose Conditioning and Pose Control for Virtual Try-On
Qi Li
Shuwen Qiu
Julien Han
Xingzi Xu
M. S. Seyfioglu
Kee Kiat Koo
Karim Bouyarmane
3DH
212
1
0
24 Sep 2025
InstructVTON: Optimal Auto-Masking and Natural-Language-Guided Interactive Style Control for Inpainting-Based Virtual Try-On
Julien Han
Shuwen Qiu
Qi Li
Xingzi Xu
M. S. Seyfioglu
Kavosh Asadi
Karim Bouyarmane
DiffM
156
3
0
24 Sep 2025
OmniBridge: Unified Multimodal Understanding, Generation, and Retrieval via Latent Space Alignment
Teng Xiao
Zuchao Li
Lefei Zhang
178
1
0
23 Sep 2025
Text Slider: Efficient and Plug-and-Play Continuous Concept Control for Image/Video Synthesis via LoRA Adapters
Pin-Yen Chiu
I-Sheng Fang
Jun-Cheng Chen
DiffM
124
0
0
23 Sep 2025
Synthesizing Artifact Dataset for Pixel-level Detection
Dennis Menn
Feng Liang
Diana Marculescu
104
0
0
23 Sep 2025
How Far are VLMs from Visual Spatial Intelligence? A Benchmark-Driven Perspective
S. Yu
Yuxin Chen
Hao Ju
Lianjie Jia
Fuxi Zhang
...
Lin Song
Lijun Wang
Yanwei Li
Y. Shan
Huchuan Lu
LRM
319
9
0
23 Sep 2025
Training-Free Multi-Style Fusion Through Reference-Based Adaptive Modulation
Xu Liu
Yibo Lu
Xinxian Wang
Xinyu Wu
DiffM
134
4
0
23 Sep 2025
AGSwap: Overcoming Category Boundaries in Object Fusion via Adaptive Group Swapping
Zedong Zhang
Ying Tai
J. Qian
Jian Yang
Jun Yu Li
222
0
0
23 Sep 2025
Automated Prompt Generation for Creative and Counterfactual Text-to-image Synthesis
Aleksa Jelaca
Ying Jiao
Chang Tian
Marie-Francine Moens
DiffM
80
0
0
23 Sep 2025
Seg4Diff: Unveiling Open-Vocabulary Segmentation in Text-to-Image Diffusion Transformers
Chaehyun Kim
Heeseong Shin
Eunbeen Hong
Heeji Yoon
Anurag Arnab
Paul Hongsuck Seo
Sunghwan Hong
Seungryong Kim
184
6
0
22 Sep 2025
Audio Super-Resolution with Latent Bridge Models
Chang Li
Zehua Chen
Liyuan Wang
Jun Zhu
331
3
0
22 Sep 2025
MEF: A Systematic Evaluation Framework for Text-to-Image Models
Xiaojing Dong
Weilin Huang
Liang Li
Y. Li
Shu Liu
Tongtong Ou
Shuang Ouyang
Yu Tian
Fengxuan Zhao
EGVM
158
0
0
22 Sep 2025
ComposeMe: Attribute-Specific Image Prompts for Controllable Human Image Generation
Guocheng Qian
Daniil Ostashev
Egor Nemchinov
Avihay Assouline
Sergey Tulyakov
Kuan-Chien Wang
Kfir Aberman
DiffM
223
5
0
22 Sep 2025
Semantic and Visual Crop-Guided Diffusion Models for Heterogeneous Tissue Synthesis in Histopathology
Saghir Alfasly
Wataru Uegami
MD Enamul Hoq
Ghazal Alabtah
H. R. Tizhoosh
DiffM
MedIm
294
2
0
22 Sep 2025
Multi-scale Temporal Prediction via Incremental Generation and Multi-agent Collaboration
Zhitao Zeng
Guojian Yuan
Junyuan Mao
Yuxuan Wang
Xiaoshuang Jia
Yueming Jin
252
0
0
22 Sep 2025
Single-Image Depth from Defocus with Coded Aperture and Diffusion Posterior Sampling
Hodaka Kawachi
Jose Reinaldo Cunha Santos A. V. Silva Neto
Y. Yagi
Hajime Nagahara
Tomoya Nakamura
DiffM
124
0
0
22 Sep 2025
Spiffy: Multiplying Diffusion LLM Acceleration via Lossless Speculative Decoding
Sudhanshu Agrawal
Risheek Garrepalli
Raghavv Goel
Mingu Lee
Christopher Lott
Fatih Porikli
208
6
0
22 Sep 2025
Multimodal Prompt Decoupling Attack on the Safety Filters in Text-to-Image Models
Xingkai Peng
Jun Jiang
Meng Tong
Shuai Li
Weiming Zhang
Nenghai Yu
Kejiang Chen
128
0
0
21 Sep 2025
Stencil: Subject-Driven Generation with Context Guidance
IEEE International Conference on Image Processing (ICIP), 2025
Gordon Chen
Ziqi Huang
Cheston Tan
Ziwei Liu
DiffM
130
0
0
21 Sep 2025
PRISM: Precision-Recall Informed Data-Free Knowledge Distillation via Generative Diffusion
Xuewan He
Jielei Wang
Zihan Cheng
Yuchen Su
Shiyue Huang
Guoming Lu
DiffM
145
0
0
21 Sep 2025
Follow-Your-Emoji-Faster: Towards Efficient, Fine-Controllable, and Expressive Freestyle Portrait Animation
Yue Ma
Zexuan Yan
Hongyu Liu
H. Wang
Heng Pan
...
H. Shum
Zhifeng Li
Wei Liu
Linfeng Zhang
Qifeng Chen
VGen
267
13
0
20 Sep 2025
InstanceAssemble: Layout-Aware Image Generation via Instance Assembling Attention
Qiang Xiang
Shuang Sun
Binglei Li
Dejia Song
Huaxia Li
Nemo Chen
Xu Tang
Yao Hu
Junping Zhang
DiffM
296
1
0
20 Sep 2025
MANZANO: A Simple and Scalable Unified Multimodal Model with a Hybrid Vision Tokenizer
Yanghao Li
Rui Qian
Bowen Pan
Haotian Zhang
Haoshuo Huang
...
Zhengdong Zhang
Chen Chen
Yang Zhao
Ruoming Pang
Zhifeng Chen
MLLM
204
4
0
19 Sep 2025
PolyJuice Makes It Real: Black-Box, Universal Red Teaming for Synthetic Image Detectors
Sepehr Dehdashtian
Mashrur M. Morshed
Jacob H. Seidman
Gaurav Bharaj
Vishnu Boddeti
AAML
DiffM
184
0
0
19 Sep 2025
Lynx: Towards High-Fidelity Personalized Video Generation
S. Sang
Tiancheng Zhi
Tianpei Gu
Jing Liu
Linjie Luo
DiffM
VGen
208
3
0
19 Sep 2025
CIDER: A Causal Cure for Brand-Obsessed Text-to-Image Models
Fangjian Shen
Zifeng Liang
Chao Wang
Wushao Wen
DiffM
124
0
0
19 Sep 2025
DiffusionNFT: Online Diffusion Reinforcement with Forward Process
Kaiwen Zheng
Huayu Chen
Haotian Ye
Haoxiang Wang
Qinsheng Zhang
Kai Jiang
Hang Su
Stefano Ermon
Jun Zhu
Ming-Yu Liu
240
11
0
19 Sep 2025
The Iconicity of the Generated Image
Nanne van Noord
Noa Garcia
152
0
0
19 Sep 2025
Diffusion-Based Cross-Modal Feature Extraction for Multi-Label Classification
Tian Lan
Yiming Zheng
Jianxin Yin
152
0
0
19 Sep 2025
Causal Reasoning Elicits Controllable 3D Scene Generation
Shen Chen
Ruiyu Zhao
Jiale Zhou
Zongkai Wu
Jenq-Neng Hwang
Lei Li
3DV
110
0
0
18 Sep 2025
AutoEdit: Automatic Hyperparameter Tuning for Image Editing
Chau Pham
Quan Dao
Mahesh Bhosale
Yunjie Tian
Dimitris Metaxas
David Doermann
189
1
0
18 Sep 2025
LowDiff: Efficient Diffusion Sampling with Low-Resolution Condition
Jiuyi Xu
Qing Jin
Meida Chen
Andrew Feng
Yang Sui
Yangming Shi
DiffM
156
0
0
18 Sep 2025
Geometric Image Synchronization with Deep Watermarking
Pierre Fernandez
Tomáš Souček
Nikola Jovanović
Hady ElSahar
Sylvestre-Alvise Rebuffi
Valeriu Lacatusu
Tuan Tran
Alexandre Mourachko
WIGM
321
1
0
18 Sep 2025
Noise-Level Diffusion Guidance: Well Begun is Half Done
Harvey Mannering
Zhiwu Huang
Adam Prugel-Bennett
DiffM
162
0
0
17 Sep 2025
BWCache: Accelerating Video Diffusion Transformers through Block-Wise Caching
Hanshuai Cui
Zhiqing Tang
Zhifei Xu
Zhi Yao
Wenyi Zeng
Weijia Jia
VGen
301
0
0
17 Sep 2025
Towards Robust Defense against Customization via Protective Perturbation Resistant to Diffusion-based Purification
Wenkui Yang
Jie Cao
Junxian Duan
Ran He
DiffM
AAML
WIGM
286
0
0
17 Sep 2025
BiasMap: Leveraging Cross-Attentions to Discover and Mitigate Hidden Social Biases in Text-to-Image Generation
Rajatsubhra Chakraborty
Xujun Che
Depeng Xu
Cori Faklaris
Xi Niu
Shuhan Yuan
104
0
0
16 Sep 2025
Adaptive Sampling Scheduler
Qi Wang
Shuliang Zhu
Jinjia Zhou
DiffM
72
0
0
16 Sep 2025
Double Helix Diffusion for Cross-Domain Anomaly Image Generation
Linchun Wu
Qin Zou
Xianbiao Qi
Bo Du
Zhongyuan Wang
Qingquan Li
168
0
0
16 Sep 2025
MIA-EPT: Membership Inference Attack via Error Prediction for Tabular Data
Eyal German
Daniel Samira
Yuval Elovici
A. Shabtai
208
0
0
16 Sep 2025
SpeCa: Accelerating Diffusion Transformers with Speculative Feature Caching
Jiacheng Liu
Chang Zou
Yuanhuiyi Lyu
Fei Ren
Shaobo Wang
Kaixin Li
Linfeng Zhang
DiffM
214
5
0
15 Sep 2025
Flow Straight and Fast in Hilbert Space: Functional Rectified Flow
Jianxin Zhang
Clayton Scott
140
1
0
12 Sep 2025
A Discrepancy-Based Perspective on Dataset Condensation
Tong Chen
Raghavendra Selvan
DD
261
0
0
12 Sep 2025
Compute Only 16 Tokens in One Timestep: Accelerating Diffusion Transformers with Cluster-Driven Feature Caching
Zhixin Zheng
Xinyu Wang
Chang Zou
Shaobo Wang
Linfeng Zhang
144
2
0
12 Sep 2025
InfGen: A Resolution-Agnostic Paradigm for Scalable Image Synthesis
Tao Han
Wanghan Xu
Junchao Gong
Xiaoyu Yue
Song Guo
Luping Zhou
Lei Bai
122
1
0
12 Sep 2025
Maestro: Self-Improving Text-to-Image Generation via Agent Orchestration
Xingchen Wan
Han Zhou
Ruoxi Sun
Hootan Nakhost
Ke Jiang
Rajarishi Sinha
Sercan Ö. Arık
246
4
0
12 Sep 2025
MagicMirror: A Large-Scale Dataset and Benchmark for Fine-Grained Artifacts Assessment in Text-to-Image Generation
Jia Wang
Jie Hu
Xiaoqi Ma
Hanghang Ma
Yanbing Zeng
Xiaoming Wei
EGVM
VGen
188
1
0
12 Sep 2025