2211.09800
Cited By
v1
v2 (latest)
InstructPix2Pix: Learning to Follow Image Editing Instructions
Computer Vision and Pattern Recognition (CVPR), 2023
17 November 2022
Tim Brooks
Aleksander Holynski
Alexei A. Efros
DiffM
Papers citing "InstructPix2Pix: Learning to Follow Image Editing Instructions"
50 / 1,731 papers shown
Regressor-Guided Generative Image Editing Balances User Emotions to Reduce Time Spent Online
Christoph Gebhardt
Robin Willardt
Seyedmorteza Sadat
Chih-Wei Ning
Andreas Brombach
Jie Song
Otmar Hilliges
Christian Holz
255
0
0
24 Dec 2025
LAMIC: Layout-Aware Multi-Image Composition via Scalability of Multimodal Diffusion Transformer
Yuzhuo Chen
Zehua Ma
Jianhua Wang
Kai Kang
Shunyu Yao
Weiming Zhang
VLM
166
2
0
24 Dec 2025
Refaçade: Editing Object with Given Reference Texture
Youze Huang
Penghui Ruan
Bojia Zi
Xianbiao Qi
Jianan Wang
Rong Xiao
DiffM
175
0
0
04 Dec 2025
I2I-Bench: A Comprehensive Benchmark Suite for Image-to-Image Editing Models
Juntong Wang
Jiarui Wang
Huiyu Duan
Jiaxiang Kang
Guangtao Zhai
Xiongkuo Min
VLM
170
0
0
04 Dec 2025
GaussianBlender: Instant Stylization of 3D Gaussians with Disentangled Latent Spaces
Melis Ocal
Xiaoyan Xing
Yue Li
Ngo Anh Vien
Sezer Karaoglu
Theo Gevers
3DGS
129
0
0
03 Dec 2025
DirectDrag: High-Fidelity, Mask-Free, Prompt-Free Drag-based Image Editing via Readout-Guided Feature Alignment
Sheng-Hao Liao
Shang-Fu Chen
Tai-Ming Huang
Wen-Huang Cheng
Kai-Lung Hua
DiffM
132
0
0
03 Dec 2025
Zero-Shot Video Translation and Editing with Frame Spatial-Temporal Correspondence
Shuai Yang
J. Lin
Yifan Zhou
Ziwei Liu
Chen Change Loy
DiffM
VGen
227
0
0
03 Dec 2025
CAMEO: Correspondence-Attention Alignment for Multi-View Diffusion Models
Minkyung Kwon
J. Choi
Jiho Park
Seonghu Jeon
Jinhyuk Jang
Junyoung Seo
Minseop Kwak
Jin-Hwa Kim
Seungryong Kim
90
0
0
02 Dec 2025
UnicEdit-10M: A Dataset and Benchmark Breaking the Scale-Quality Barrier via Unified Verification for Reasoning-Enriched Edits
Keming Ye
Z. Huang
Canmiao Fu
Qingyang Liu
Jiani Cai
Zheqi Lv
Chen Li
Jing Lyu
Zhou Zhao
Shengyu Zhang
68
0
0
01 Dec 2025
TokenPure: Watermark Removal through Tokenized Appearance and Structural Guidance
Pei Yang
Y. Liu
Kelly Peng
Yuan Gao
Yiren Song
WIGM
193
0
0
01 Dec 2025
Generative Editing in the Joint Vision-Language Space for Zero-Shot Composed Image Retrieval
Xin Wang
H. Zhang
Mang Li
Zhaohui Xia
Y. Chen
Yu Zhang
Chunyu Wei
DiffM
147
0
0
01 Dec 2025
FreqEdit: Preserving High-Frequency Features for Robust Multi-Turn Image Editing
Yucheng Liao
Jiajun Liang
Kaiqian Cui
Baoquan Zhao
Haoran Xie
Wei Liu
Qing Li
Xudong Mao
126
0
0
01 Dec 2025
Reversible Inversion for Training-Free Exemplar-guided Image Editing
Yuke Li
Lianli Gao
Ji Zhang
Pengpeng Zeng
Lichuan Xiang
Hongkai Wen
Heng Tao Shen
Jingkuan Song
DiffM
129
0
0
01 Dec 2025
BioPro: On Difference-Aware Gender Fairness for Vision-Language Models
Y. Lin
Jiayao Ma
Qingguo Hu
Derek F. Wong
Jinsong Su
64
0
0
30 Nov 2025
Charts Are Not Images: On the Challenges of Scientific Chart Editing
Shawn Li
Ryan Rossi
Sungchul Kim
Sunav Choudhary
Franck Dernoncourt
Puneet Mathur
Zhengzhong Tu
Yue Zhao
69
0
0
30 Nov 2025
Dynamic-eDiTor: Training-Free Text-Driven 4D Scene Editing with Multimodal Diffusion Transformer
Dong In Lee
Hyungjun Doh
Seunggeun Chi
Runlin Duan
Sangpil Kim
K. Ramani
DiffM
3DGS
VGen
145
0
0
30 Nov 2025
POLARIS: Projection-Orthogonal Least Squares for Robust and Adaptive Inversion in Diffusion Models
Wenshuo Chen
Haosen Li
Shaofeng Liang
Lei Wang
Haozhe Jia
Kaishen Yuan
J. Wu
Bowen Tian
Yutao Yue
78
0
0
29 Nov 2025
RealGen: Photorealistic Text-to-Image Generation via Detector-Guided Rewards
Junyan Ye
Leiqi Zhu
Yuncheng Guo
Dongzhi Jiang
Zilong Huang
Yifan Zhang
Zhiyuan Yan
Haohuan Fu
Conghui He
Weijia Li
EGVM
117
0
0
29 Nov 2025
Vision Bridge Transformer at Scale
Zhenxiong Tan
Zeqing Wang
Xingyi Yang
Songhua Liu
Xinchao Wang
DiffM
100
0
0
28 Nov 2025
Fast Multi-view Consistent 3D Editing with Video Priors
Liyi Chen
Ruihuang Li
Guowen Zhang
Pengfei Wang
Lei Zhang
DiffM
VGen
223
1
0
28 Nov 2025
MultiBanana: A Challenging Benchmark for Multi-Reference Text-to-Image Generation
Yuta Oshima
Daiki Miyake
Kohsei Matsutani
Yusuke Iwasawa
Masahiro Suzuki
Yutaka Matsuo
Hiroki Furuta
59
0
0
28 Nov 2025
DEAL-300K: Diffusion-based Editing Area Localization with a 300K-Scale Dataset and Frequency-Prompted Baseline
Rui Zhang
Hongxia Wang
Hangqing Liu
Yang Zhou
Q. Zeng
91
0
0
28 Nov 2025
JarvisEvo: Towards a Self-Evolving Photo Editing Agent with Synergistic Editor-Evaluator Optimization
Yunlong Lin
Linqing Wang
Kunjie Lin
Zixu Lin
Kaixiong Gong
...
Yuyang Peng
Wenxun Dai
Xinghao Ding
C. Wang
Qinglin Lu
245
0
0
28 Nov 2025
One-to-All Animation: Alignment-Free Character Animation and Image Pose Transfer
S. Shi
Jing Xu
Zhihang Li
Chunli Peng
Xiaoda Yang
Lijing Lu
Kai Hu
Jiangning Zhang
DiffM
123
0
0
28 Nov 2025
3D-Consistent Multi-View Editing by Diffusion Guidance
Josef Bengtson
David Nilsson
Dong In Lee
Fredrik Kahl
3DGS
123
0
0
27 Nov 2025
ReasonEdit: Towards Reasoning-Enhanced Image Editing Models
Fukun Yin
Shiyu Liu
Yucheng Han
Zhibo Wang
Peng Xing
...
Pengtao Chen
Xiangyu Zhang
Daxin Jiang
Xianfang Zeng
Gang Yu
DiffM
KELM
LRM
241
0
0
27 Nov 2025
Match-and-Fuse: Consistent Generation from Unstructured Image Sets
Kate Feingold
Omri Kaduri
Tali Dekel
DiffM
VGen
86
0
0
27 Nov 2025
DiffStyle360: Diffusion-Based 360° Head Stylization via Style Fusion Attention
Furkan Guzelant
Arda Goktogan
Tarık Kaya
Aysegül Dündar
72
0
0
27 Nov 2025
PG-ControlNet: A Physics-Guided ControlNet for Generative Spatially Varying Image Deblurring
Hakki Motorcu
Mujdat Cetin
DiffM
239
0
0
26 Nov 2025
SpatialBench: Benchmarking Multimodal Large Language Models for Spatial Cognition
Peiran Xu
Sudong Wang
Yao Zhu
Jianing Li
Yunjian Zhang
LRM
343
1
0
26 Nov 2025
CameraMaster: Unified Camera Semantic-Parameter Control for Photography Retouching
Qirui Yang
Yang Yang
Ying Zeng
Xiaobin Hu
Bo Li
Huanjing Yue
Jingyu Yang
P. Jiang
DiffM
VGen
312
0
0
26 Nov 2025
MIRA: Multimodal Iterative Reasoning Agent for Image Editing
Ziyun Zeng
Hang Hua
Jiebo Luo
KELM
LM&Ro
LRM
357
0
0
26 Nov 2025
MUSE: Manipulating Unified Framework for Synthesizing Emotions in Images via Test-Time Optimization
Yingjie Xia
X. Wang
Jinglei Shi
Vicky Kalogeiton
Jian Yang
EGVM
VGen
546
0
0
26 Nov 2025
A Training-Free Approach for Multi-ID Customization via Attention Adjustment and Spatial Control
Jiawei Lin
Guanlong Jiao
Jianjin Xu
277
0
0
25 Nov 2025
Low-Resolution Editing is All You Need for High-Resolution Editing
J. Lee
Hyunsoo Lee
Yong Jae Lee
Bohyung Han
DiffM
222
0
0
25 Nov 2025
HBridge: H-Shape Bridging of Heterogeneous Experts for Unified Multimodal Understanding and Generation
Xiang Wang
Zhifei Zhang
Chentao Song
Zhe Lin
Yuqian Zhou
...
Haitian Zheng
Jason Kuen
Yuehuan Wang
Changxin Gao
Nong Sang
MoE
172
0
0
25 Nov 2025
TReFT: Taming Rectified Flow Models For One-Step Image Translation
Shengqian Li
Ming Gao
Y. Liu
Zuzeng Lin
Feng Wang
Feng Dai
144
0
0
25 Nov 2025
Are Image-to-Video Models Good Zero-Shot Image Editors?
Zechuan Zhang
Zhenyuan Chen
Zongxin Yang
Yi Yang
DiffM
VGen
560
0
0
24 Nov 2025
MonoMSK: Monocular 3D Musculoskeletal Dynamics Estimation
Farnoosh Koleini
Hongfei Xue
Ahmed Helmy
Pu Wang
244
1
0
24 Nov 2025
ReCoGS: Real-time ReColoring for Gaussian Splatting scenes
Lorenzo Rutayisire
Nicola Capodieci
Fabio Pellacini
3DGS
126
0
0
23 Nov 2025
IE-Critic-R1: Advancing the Explanatory Measurement of Text-Driven Image Editing for Human Perception Alignment
Bowen Qu
Shangkun Sun
Xiaoyu Liang
Wei-Nan Gao
94
0
0
22 Nov 2025
Counterfactual World Models via Digital Twin-conditioned Video Diffusion
Yiqing Shen
Aiza Maksutova
Chenjia Li
Mathias Unberath
DiffM
VGen
165
0
0
21 Nov 2025
Native 3D Editing with Full Attention
Weiwei Cai
Shuangkang Fang
Weicai Ye
Xin Dong
Y. Yang
Xuanyang Zhang
Wei Cheng
Yanpei Cao
Gang Yu
Tao Chen
DiffM
127
0
0
21 Nov 2025
SPIDER: Spatial Image CorresponDence Estimator for Robust Calibration
Zhimin Shao
Abhay Kumar Yadav
Rama Chellappa
Cheng-Fang Peng
81
0
0
21 Nov 2025
Show Me: Unifying Instructional Image and Video Generation with Diffusion Models
Yujiang Pu
Zhanbo Huang
Vishnu Boddeti
Yu Kong
DiffM
VGen
118
0
0
21 Nov 2025
DeltaDeno: Zero-Shot Anomaly Generation via Delta-Denoising Attribution
C. Xu
Chengkan Lv
Qiyu Chen
Yunkang Cao
Feng Zhang
Zhengtao Zhang
DiffM
281
0
0
21 Nov 2025
T2T-VICL: Unlocking the Boundaries of Cross-Task Visual In-Context Learning via Implicit Text-Driven VLMs
Shao-Jun Xia
Huixin Zhang
Zhengzhong Tu
MLLM
VLM
424
0
0
20 Nov 2025
NaTex: Seamless Texture Generation as Latent Color Diffusion
Zeqiang Lai
Yunfei Zhao
Zibo Zhao
Xin Yang
Xin Huang
J. Huang
Xiangyu Yue
Chunchao Guo
DiffM
175
0
0
20 Nov 2025
Thinking-while-Generating: Interleaving Textual Reasoning throughout Visual Generation
Ziyu Guo
Renrui Zhang
Hongyu Li
M. Zhang
Xinyan Chen
Sifan Wang
Yan Feng
Peng Pei
Pheng-Ann Heng
245
4
0
20 Nov 2025
SplitFlux: Learning to Decouple Content and Style from a Single Image
Yitong Yang
Y Samuel Wang
Changshuo Wang
Yongjun Zhang
Ziyang Chen
Shuting He
213
0
0
19 Nov 2025