2205.11487
Cited By
Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding
Neural Information Processing Systems (NeurIPS), 2022
23 May 2022
Chitwan Saharia
William Chan
Saurabh Saxena
Lala Li
Jay Whang
Emily L. Denton
Seyed Kamyar Seyed Ghasemipour
Burcu Karagol Ayan
S. S. Mahdavi
Raphael Gontijo-Lopes
Tim Salimans
Jonathan Ho
David J Fleet
Mohammad Norouzi
VLM
Papers citing "Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding"
50 / 5,040 papers shown
Refaçade: Editing Object with Given Reference Texture
Youze Huang
Penghui Ruan
Bojia Zi
Xianbiao Qi
Jianan Wang
Rong Xiao
DiffM
175
0
0
04 Dec 2025
PosterCopilot: Toward Layout Reasoning and Controllable Editing for Professional Graphic Design
Jiazhe Wei
Ken Li
Tianyu Lao
Haofan Wang
Liang Wang
Caifeng Shan
Chenyang Si
97
0
0
03 Dec 2025
RNNs perform task computations by dynamically warping neural representations
Arthur Pellegrino
Angus Chadwick
43
1
0
03 Dec 2025
DirectDrag: High-Fidelity, Mask-Free, Prompt-Free Drag-based Image Editing via Readout-Guided Feature Alignment
Sheng-Hao Liao
Shang-Fu Chen
Tai-Ming Huang
Wen-Huang Cheng
Kai-Lung Hua
DiffM
132
0
0
03 Dec 2025
GeoVideo: Introducing Geometric Regularization into Video Generation Model
Yunpeng Bai
Shaoheng Fang
Chaohui Yu
Fan Wang
Qixing Huang
DiffM
VGen
MDE
454
2
0
03 Dec 2025
Towards Irreversible Machine Unlearning for Diffusion Models
Xun Yuan
Zilong Zhao
Jiayu Li
A. Pasikhani
P. Gope
Biplab Sikdar
165
0
0
03 Dec 2025
Zero-Shot Video Translation and Editing with Frame Spatial-Temporal Correspondence
Shuai Yang
J. Lin
Yifan Zhou
Ziwei Liu
Chen Change Loy
DiffM
VGen
227
0
0
03 Dec 2025
Stable Signer: Hierarchical Sign Language Generative Model
Sen Fang
Yalin Feng
Hongbin Zhong
Yanxin Zhang
Dimitris N. Metaxas
SLR
356
0
0
03 Dec 2025
U4D: Uncertainty-Aware 4D World Modeling from LiDAR Sequences
Xiang Xu
Ao Liang
Youquan Liu
Linfeng Li
Lingdong Kong
Ziwei Liu
Qingshan Liu
138
1
0
02 Dec 2025
Video Diffusion Models Excel at Tracking Similar-Looking Objects Without Supervision
Chenshuang Zhang
Kang Zhang
Joon Son Chung
In So Kweon
Junmo Kim
Chengzhi Mao
DiffM
234
0
0
02 Dec 2025
Does Hearing Help Seeing? Investigating Audio-Video Joint Denoising for Video Generation
Jianzong Wu
Hao Lian
Dachao Hao
Ye Tian
Qingyu Shi
Biaolong Chen
Hao Jiang
Yunhai Tong
DiffM
VGen
252
0
0
02 Dec 2025
Distill, Forget, Repeat: A Framework for Continual Unlearning in Text-to-Image Diffusion Models
Naveen George
Naoki Murata
Yuhta Takida
Konda Reddy Mopuri
Yuki Mitsufuji
DiffM
MU
379
0
0
02 Dec 2025
PaCo-RL: Advancing Reinforcement Learning for Consistent Image Generation with Pairwise Reward Modeling
Bowen Ping
Chengyou Jia
Minnan Luo
Changliang Xia
Xin Shen
Zhuohang Dang
Hangwei Qian
EGVM
70
0
0
02 Dec 2025
Understanding and Harnessing Sparsity in Unified Multimodal Models
Shwai He
Chaorui Deng
Ang Li
Shen Yan
MoE
212
1
0
02 Dec 2025
OmniPerson: Unified Identity-Preserving Pedestrian Generation
Changxiao Ma
Chao Yuan
Xincheng Shi
Yuzhuo Ma
Yongfei Zhang
Longkun Zhou
Yujia Zhang
Shangze Li
Yifan Xu
VGen
214
0
0
02 Dec 2025
TUNA: Taming Unified Visual Representations for Native Unified Multimodal Models
Zhiheng Liu
Weiming Ren
Haozhe Liu
Zijian Zhou
S. Chen
...
Ping Luo
Wei Liu
Tao Xiang
Jonas Schult
Yuren Cong
155
0
0
01 Dec 2025
PhyCustom: Towards Realistic Physical Customization in Text-to-Image Generation
Fan Wu
Cheng Chen
Zhoujie Fu
Jiacheng Wei
Yi Tian Xu
Deheng Ye
Guosheng Lin
DiffM
78
0
0
01 Dec 2025
Generative Editing in the Joint Vision-Language Space for Zero-Shot Composed Image Retrieval
Xin Wang
H. Zhang
Mang Li
Zhaohui Xia
Y. Chen
Yu Zhang
Chunyu Wei
DiffM
146
0
0
01 Dec 2025
CoatFusion: Controllable Material Coating in Images
Sagie Levy
Elad Aharoni
Matan Levy
Ariel Shamir
Dani Lischinski
DiffM
AI4CE
147
0
0
01 Dec 2025
Envision: Benchmarking Unified Understanding & Generation for Causal World Process Insights
Juanxi Tian
Siyuan Li
Conghui He
Lijun Wu
Cheng Tan
EGVM
VGen
163
0
0
01 Dec 2025
Deep Unsupervised Anomaly Detection in Brain Imaging: Large-Scale Benchmarking and Bias Analysis
Alexander Frötscher
Christian F. Baumgartner
T. Wolfers
OOD
235
0
0
01 Dec 2025
FineGRAIN: Evaluating Failure Modes of Text-to-Image Models with Vision Language Model Judges
Kevin David Hayes
Micah Goldblum
Vikash Sehwag
Gowthami Somepalli
Ashwinee Panda
Tom Goldstein
MLLM
EGVM
240
0
0
01 Dec 2025
Relightable Holoported Characters: Capturing and Relighting Dynamic Human Performance from Sparse Views
Kunwar Maheep Singh
Jianchun Chen
Vladislav Golyanik
Stephan Garbin
Thabo Beeler
Rishabh Dabral
Marc Habermann
Christian Theobalt
3DH
244
0
0
29 Nov 2025
CC-FMO: Camera-Conditioned Zero-Shot Single Image to 3D Scene Generation with Foundation Model Orchestration
Boshi Tang
Henry Zheng
Rui Huang
Gao Huang
VGen
191
0
0
29 Nov 2025
TARFVAE: Efficient One-Step Generative Time Series Forecasting via TARFLOW based VAE
Jiawen Wei
Lan Jiang
Pengbo Wei
Ziwen Ye
Teng Song
Chen Chen
Guangrui Ma
AI4TS
116
0
0
28 Nov 2025
CoordSpeaker: Exploiting Gesture Captioning for Coordinated Caption-Empowered Co-Speech Gesture Generation
Fengyi Fang
Sicheng Yang
Wenming Yang
SLR
220
0
0
28 Nov 2025
One-to-All Animation: Alignment-Free Character Animation and Image Pose Transfer
S. Shi
Jing Xu
Zhihang Li
Chunli Peng
Xiaoda Yang
Lijing Lu
Kai Hu
Jiangning Zhang
DiffM
122
0
0
28 Nov 2025
GOATex: Geometry & Occlusion-Aware Texturing
Hyunjin Kim
Kunho Kim
Adam Lee
Wonkwang Lee
DiffM
101
0
0
28 Nov 2025
REVEAL: Reasoning-enhanced Forensic Evidence Analysis for Explainable AI-generated Image Detection
Huangsen Cao
Qin Mei
Zhiheng Li
Yuxi Li
Ying Zhang
...
Zhimeng Zhang
Xin Ding
Yongwei Wang
Jing Lyu
Fei Wu
131
0
0
28 Nov 2025
LC4-DViT: Land-cover Creation for Land-cover Classification with Deformable Vision Transformer
Kai Wang
S. Chen
Weicong Pang
Chenchen Zhang
Renjun Gao
Z. Chen
Cheng Li
Dasa Gu
Rui Huang
Alexis Kai Hon Lau
ViT
64
0
0
27 Nov 2025
Convergence Dynamics of Over-Parameterized Score Matching for a Single Gaussian
Yiran Zhang
Weihang Xu
Mo Zhou
Maryam Fazel
S. S. Du
DiffM
143
0
0
27 Nov 2025
AI killed the video star. Audio-driven diffusion model for expressive talking head generation
Baptiste Chopin
Tashvik Dhamija
P. Balaji
Yaohui Wang
A. Dantcheva
DiffM
VGen
70
0
0
27 Nov 2025
Generative Anchored Fields: Controlled Data Generation via Emergent Velocity Fields and Transport Algebra
Deressa Wodajo Deressa
Hannes Mareen
Peter Lambert
Glenn Van Wallendael
64
0
0
27 Nov 2025
Ar2Can: An Architect and an Artist Leveraging a Canvas for Multi-Human Generation
Shubhankar Borse
Phuc Pham
Farzad Farhadzadeh
Seokeon Choi
P. Nguyen
Anh Tran
Sungrack Yun
Munawar Hayat
Fatih Porikli
78
0
0
27 Nov 2025
LaGen: Towards Autoregressive LiDAR Scene Generation
Sizhuo Zhou
Xiaosong Jia
Fanrui Zhang
Junjie Li
Juyong Zhang
Yukang Feng
Jianwen Sun
Songbur Wong
Junqi You
Junchi Yan
290
0
0
26 Nov 2025
Canvas-to-Image: Compositional Image Generation with Multimodal Controls
Yusuf Dalva
Guocheng Qian
Maya Goldenberg
Tsai-Shien Chen
Kfir Aberman
Sergey Tulyakov
Pinar Yanardag
Kuan-Chieh Wang
DiffM
199
0
0
26 Nov 2025
MUSE: Manipulating Unified Framework for Synthesizing Emotions in Images via Test-Time Optimization
Yingjie Xia
X. Wang
Jinglei Shi
Vicky Kalogeiton
Jian Yang
EGVM
VGen
546
0
0
26 Nov 2025
ShapeGen: Towards High-Quality 3D Shape Synthesis
Yangguang Li
Xianglong He
Zi-Xin Zou
Zexiang Liu
Wanli Ouyang
Ding Liang
Yan-Pei Cao
193
0
0
25 Nov 2025
Test-Time Alignment of Text-to-Image Diffusion Models via Null-Text Embedding Optimisation
Taehoon Kim
Henry Gouk
Timothy M. Hospedales
198
0
0
25 Nov 2025
HiCoGen: Hierarchical Compositional Text-to-Image Generation in Diffusion Models via Reinforcement Learning
Hongji Yang
Yucheng Zhou
Wencheng Han
Runzhou Tao
Zhongying Qiu
Jianfei Yang
Jianbing Shen
DiffM
EGVM
349
0
0
25 Nov 2025
PRADA: Probability-Ratio-Based Attribution and Detection of Autoregressive-Generated Images
Simon Damm
Jonas Ricker
Henning Petzka
Asja Fischer
189
0
0
25 Nov 2025
Low-Resolution Editing is All You Need for High-Resolution Editing
J. Lee
Hyunsoo Lee
Yong Jae Lee
Bohyung Han
DiffM
222
0
0
25 Nov 2025
Image Diffusion Models Exhibit Emergent Temporal Propagation in Videos
Youngseo Kim
Dohyun Kim
Geohee Han
Paul Hongsuck Seo
186
0
0
25 Nov 2025
iMontage: Unified, Versatile, Highly Dynamic Many-to-many Image Generation
Zhoujie Fu
Xianfang Zeng
Jinghong Lan
Xinyao Liao
Cheng Chen
...
Wei Cheng
Shiyu Liu
Y. Chen
Gang Yu
Guosheng Lin
DiffM
VGen
348
1
0
25 Nov 2025
Now You See It, Now You Don't - Instant Concept Erasure for Safe Text-to-Image and Video Generation
Shristi Das Biswas
Arani Roy
Kaushik Roy
VGen
265
0
0
24 Nov 2025
Demystifying Diffusion Objectives: Reweighted Losses are Better Variational Bounds
Jiaxin Shi
Michalis K. Titsias
DiffM
268
0
0
24 Nov 2025
Learning What to Trust: Bayesian Prior-Guided Optimization for Visual Generation
Ruiying Liu
Yuanzhi Liang
Haibin Huang
Tianshu Yu
Chi Zhang
101
0
0
24 Nov 2025
A Self-Conditioned Representation Guided Diffusion Model for Realistic Text-to-LiDAR Scene Generation
Wentao Qu
Guofeng Mei
Yang Wu
Yongshun Gong
Xiaoshui Huang
Liang Xiao
DiffM
186
0
0
24 Nov 2025
LAA3D: A Benchmark of Detecting and Tracking Low-Altitude Aircraft in 3D Space
Hai Wu
Shuai Tang
Jiale Wang
Longkun Zou
Mingyue Guo
Rongqin Liang
Ke Chen
Yaowei Wang
143
1
0
24 Nov 2025
DiP: Taming Diffusion Models in Pixel Space
Z. Chen
J. Zhu
Xu Chen
Jiangning Zhang
Xiaobin Hu
Hanzhen Zhao
C. Wang
Jian Yang
Ying Tai
289
0
0
24 Nov 2025