2205.11487
Cited By
Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding
Neural Information Processing Systems (NeurIPS), 2022
23 May 2022
Chitwan Saharia
William Chan
Saurabh Saxena
Lala Li
Jay Whang
Emily L. Denton
Seyed Kamyar Seyed Ghasemipour
Burcu Karagol Ayan
S. S. Mahdavi
Raphael Gontijo-Lopes
Tim Salimans
Jonathan Ho
David J Fleet
Mohammad Norouzi
VLM
Papers citing "Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding"
50 / 5,056 papers shown
Maestro: Self-Improving Text-to-Image Generation via Agent Orchestration
Xingchen Wan
Han Zhou
Ruoxi Sun
Hootan Nakhost
Ke Jiang
Rajarishi Sinha
Sercan Ö. Arık
310
6
0
12 Sep 2025
Flow Straight and Fast in Hilbert Space: Functional Rectified Flow
Jianxin Zhang
Clayton Scott
192
1
0
12 Sep 2025
InfGen: A Resolution-Agnostic Paradigm for Scalable Image Synthesis
Tao Han
Wanghan Xu
Junchao Gong
Xiaoyu Yue
Song Guo
Luping Zhou
Lei Bai
152
2
0
12 Sep 2025
T2Bs: Text-to-Character Blendshapes via Video Generation
Jiahao Luo
Chaoyang Wang
Michael Vasilkovsky
V. Shakhrai
Di Liu
...
Sergey Tulyakov
Peter Wonka
Hsin-Ying Lee
James Davis
Jian Wang
DiffM
249
1
0
12 Sep 2025
Compute Only 16 Tokens in One Timestep: Accelerating Diffusion Transformers with Cluster-Driven Feature Caching
Zhixin Zheng
Xinyu Wang
Chang Zou
Shaobo Wang
Linfeng Zhang
186
8
0
12 Sep 2025
A Discrepancy-Based Perspective on Dataset Condensation
Tong Chen
Raghavendra Selvan
DD
304
1
0
12 Sep 2025
MagicMirror: A Large-Scale Dataset and Benchmark for Fine-Grained Artifacts Assessment in Text-to-Image Generation
Jia Wang
Jie Hu
Xiaoqi Ma
Hanghang Ma
Yanbing Zeng
Xiaoming Wei
EGVM
VGen
254
2
0
12 Sep 2025
Composable Score-based Graph Diffusion Model for Multi-Conditional Molecular Generation
Anjie Qiao
Zhen Wang
Chuan Chen
Defu Lian
Tong Xu
DiffM
284
0
0
11 Sep 2025
Bridging the Gap Between Ideal and Real-world Evaluation: Benchmarking AI-Generated Image Detection in Challenging Scenarios
Chunxiao Li
Xiaoxiao Wang
Meiling Li
Boming Miao
Peng Sun
Yunjian Zhang
Xiangyang Ji
Yao Zhu
190
2
0
11 Sep 2025
Region-Wise Correspondence Prediction between Manga Line Art Images
Yingxuan Li
Jiafeng Mao
Qianru Qiu
Yusuke Matsui
ViT
226
0
0
11 Sep 2025
Discovering Divergent Representations between Text-to-Image Models
Lisa Dunlap
Joseph E. Gonzalez
Trevor Darrell
Fabian Caba Heilbron
Josef Sivic
Bryan C. Russell
EGVM
172
0
0
10 Sep 2025
Calibrating MLLM-as-a-judge via Multimodal Bayesian Prompt Ensembles
Eric Slyman
Mehrab Tanjim
Kushal Kafle
Stefan Lee
212
0
0
10 Sep 2025
ForTIFAI: Fending Off Recursive Training Induced Failure for AI Model Collapse
Soheil Zibakhsh Shabgahi
Pedram Aghazadeh
Azalia Mirhoseini
F. Koushanfar
318
0
0
10 Sep 2025
Universal Few-Shot Spatial Control for Diffusion Models
Kiet T. Nguyen
Chanhuyk Lee
Donggyun Kim
Dong Hoon Lee
Seunghoon Hong
179
0
0
09 Sep 2025
Semantic Watermarking Reinvented: Enhancing Robustness and Generation Quality with Fourier Integrity
Sung Ju Lee
Nam Ik Cho
AAML
279
7
0
09 Sep 2025
UMO: Scaling Multi-Identity Consistency for Image Customization via Matching Reward
Yufeng Cheng
Wenxu Wu
Shaojin Wu
Mengqi Huang
Fei Ding
Qian He
133
10
0
08 Sep 2025
TIDE: Achieving Balanced Subject-Driven Image Generation via Target-Instructed Diffusion Enhancement
Jibai Lin
B. Ma
Yating Yang
Xi Zhou
Rong Ma
Turghun Osman
Ahtamjan Ahmat
Rui Dong
Lei Wang
181
0
0
08 Sep 2025
Scaling Transformer-Based Novel View Synthesis Models with Token Disentanglement and Synthetic Data
Nithin Gopalakrishnan Nair
Srinivas Kaza
Xuan Luo
Vishal M. Patel
Stephen Lombardi
Jungyeon Park
124
0
0
08 Sep 2025
DreamAudio: Customized Text-to-Audio Generation with Diffusion Models
Yi Yuan
Xubo Liu
Haohe Liu
Xiyuan Kang
Zhuo Chen
Yuping Wang
Mark D. Plumbley
Wenwu Wang
180
2
0
07 Sep 2025
Learning in ImaginationLand: Omnidirectional Policies through 3D Generative Models (OP-Gen)
Yifei Ren
Edward Johns
LM&Ro
219
2
0
07 Sep 2025
Tell-Tale Watermarks for Explanatory Reasoning in Synthetic Media Forensics
Ching-Chun Chang
Isao Echizen
WIGM
239
0
0
06 Sep 2025
SuMa: A Subspace Mapping Approach for Robust and Effective Concept Erasure in Text-to-Image Diffusion Models
K. Nguyen
Anh Tran
Cuong Pham
172
0
0
06 Sep 2025
A Scalable Attention-Based Approach for Image-to-3D Texture Mapping
Arianna Rampini
Kanika Madan
Bruno Roy
AmirHossein Zamani
Derek Cheung
148
0
0
05 Sep 2025
UniView: Enhancing Novel View Synthesis From A Single Image By Unifying Reference Features
Haowang Cui
Rui Chen
Tao Luo
Rui Li
Jiaze Wang
180
0
0
05 Sep 2025
SynGen-Vision: Synthetic Data Generation for training industrial vision models
Alpana Dubey
Suma Mani Kuriakose
Nitish Bhardwaj
143
0
0
05 Sep 2025
From Editor to Dense Geometry Estimator
Jiyuan Wang
Chunyu Lin
Lei-huan Sun
Rongying Liu
Lang Nie
Mingxing Li
K. Liao
Xiangxiang Chu
DiffM
MDE
318
11
0
04 Sep 2025
MEPG: Multi-Expert Planning and Generation for Compositionally-Rich Image Generation
Yuan Zhao
Lin Liu
DiffM
MoE
242
0
0
04 Sep 2025
Wavelet Fourier Diffuser: Frequency-Aware Diffusion Model for Reinforcement Learning
Yifu Luo
Yongzhe Chang
Xueqian Wang
187
2
0
04 Sep 2025
The Telephone Game: Evaluating Semantic Drift in Unified Models
Sabbir Mollah
Rohit Gupta
S. Swetha
Qingyang Liu
Ahnaf Munir
Mubarak Shah
VLM
233
2
0
04 Sep 2025
PromptEnhancer: A Simple Approach to Enhance Text-to-Image Models via Chain-of-Thought Prompt Rewriting
Linqing Wang
Ximing Xing
Yiji Cheng
Zhiyuan Zhao
Donghao Li
...
Chunyu Wang
Xinchi Deng
S. Gu
C. Wang
Qinglin Lu
543
18
0
04 Sep 2025
Inpaint4Drag: Repurposing Inpainting Models for Drag-Based Image Editing via Bidirectional Warping
Jingyi Lu
Kai Han
DiffM
227
3
0
04 Sep 2025
LuxDiT: Lighting Estimation with Video Diffusion Transformer
Ruofan Liang
Kai He
Zan Gojcic
Igor Gilitschenski
Sanja Fidler
Nandita Vijaykumar
Zian Wang
ViT
VGen
170
6
0
03 Sep 2025
Human Preference-Aligned Concept Customization Benchmark via Decomposed Evaluation
Reina Ishikawa
Ryo Fujii
Hideo Saito
Ryo Hachiuma
224
1
0
03 Sep 2025
TeRA: Rethinking Text-driven Realistic 3D Avatar Generation
Yanwen Wang
Yiyu Zhuang
Jiawei Zhang
Li Wang
Yifei Zeng
X. Cao
Xinxin Zuo
Hao Zhu
214
1
0
02 Sep 2025
Understanding Space Is Rocket Science -- Only Top Reasoning Models Can Solve Spatial Understanding Tasks
Nils Hoehing
Mayug Maniparambil
Ellen Rushe
Noel E. O'Connor
Anthony Ventresque
LRM
257
0
0
02 Sep 2025
Palette Aligned Image Diffusion
Elad Aharoni
Noy Porat
Dani Lischinski
Ariel Shamir
DiffM
VLM
142
0
0
02 Sep 2025
DynaMind: Reconstructing Dynamic Visual Scenes from EEG by Aligning Temporal Dynamics and Multimodal Semantics to Guided Diffusion
Junxiang Liu
Junming Lin
Jiangtong Li
Jie Li
DiffM
VGen
140
1
0
01 Sep 2025
FICGen: Frequency-Inspired Contextual Disentanglement for Layout-driven Degraded Image Generation
Wenzhuang Wang
Yifan Zhao
Mingcan Ma
Ming-Yuan Liu
Zhonglin Jiang
Yong Chen
Jia Li
DiffM
165
3
0
01 Sep 2025
CompSlider: Compositional Slider for Disentangled Multiple-Attribute Image Generation
Zixin Zhu
Kevin Duarte
Mamshad Nayeem Rizve
Chengyuan Xu
Ratheesh Kalarot
Junsong Yuan
DiffM
267
1
0
31 Aug 2025
Partially Functional Dynamic Backdoor Diffusion-based Causal Model
Xinwen Liu
Lei Qian
Song Xi Chen
Niansheng Tang
201
0
0
30 Aug 2025
Category-level Text-to-Image Retrieval Improved: Bridging the Domain Gap with Diffusion Models and Vision Encoders
Faizan Farooq Khan
Vladan Stojnić
Zakaria Laskar
Mohamed Elhoseiny
Giorgos Tolias
DiffM
VLM
122
0
0
29 Aug 2025
FLORA: Efficient Synthetic Data Generation for Object Detection in Low-Data Regimes via finetuning Flux LoRA
Alvaro Patricio
Atabak Dehban
Rodrigo Ventura
233
1
0
29 Aug 2025
Revisiting Deepfake Detection: Chronological Continual Learning and the Limits of Generalization
Federico Fontana
Anxhelo Diko
Romeo Lanzino
Marco Raoul Marini
Bachir Kaddar
G. Foresti
Luigi Cinque
116
0
0
29 Aug 2025
Attacks on Approximate Caches in Text-to-Image Diffusion Models
Desen Sun
Shuncheng Jie
Sihang Liu
DiffM
213
0
0
28 Aug 2025
Audio-Guided Visual Editing with Complex Multi-Modal Prompts
Hyeonyu Kim
Seokhoon Jeong
Seonghee Han
Chanhyuk Choi
Taehwan Kim
DiffM
169
0
0
28 Aug 2025
Reusing Computation in Text-to-Image Diffusion for Efficient Generation of Image Sets
Dale Decatur
Thibault Groueix
Wang Yifan
Rana Hanocka
Vladimir G. Kim
Matheus Gadelha
170
0
0
28 Aug 2025
Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning
Y. Wang
Zhimin Li
Yuhang Zang
Yujie Zhou
Jiazi Bu
Chunyu Wang
Qinglin Lu
Cheng Jin
Jiaqi Wang
EGVM
282
55
0
28 Aug 2025
FastMesh: Efficient Artistic Mesh Generation via Component Decoupling
Jeonghwan Kim
Yushi Lan
Armando Fortes
Yongwei Chen
Xingang Pan
283
4
0
26 Aug 2025
VoxHammer: Training-Free Precise and Coherent 3D Editing in Native 3D Space
Lin Li
Zehuan Huang
Haoran Feng
Gengxiong Zhuang
Rui Chen
Chunchao Guo
Lu Sheng
DiffM
VGen
263
21
0
26 Aug 2025
Generative AI in Map-Making: A Technical Exploration and Its Implications for Cartographers
Claudio Affolter
Sidi Wu
Yizi Chen
L. Hurni
216
2
0
26 Aug 2025