InstructPix2Pix: Learning to Follow Image Editing Instructions

Computer Vision and Pattern Recognition (CVPR), 2023
17 November 2022
Tim Brooks
Aleksander Holynski
Alexei A. Efros
    DiffM

Papers citing "InstructPix2Pix: Learning to Follow Image Editing Instructions"

50 / 1,733 papers shown
Contrastive Diffusion Alignment: Learning Structured Latents for Controllable Generation
Ruchi Sandilya
Sumaira Perez
Charles Lynch
Lindsay Victoria
Benjamin Zebley
...
Nolan Williams
Timothy J. Spellman
Faith M. Gunning
C. Liston
Logan Grosenick
175
0
0
16 Oct 2025
NANO3D: A Training-Free Approach for Efficient 3D Editing Without Masks
Junliang Ye
Shenghao Xie
R. Zhao
Zhengyi Wang
Hongyu Yan
Wenqiang Zu
Lei Ma
Jun Zhu
DiffM
202
4
0
16 Oct 2025
In-Context Learning with Unpaired Clips for Instruction-based Video Editing
Xinyao Liao
Xianfang Zeng
Ziye Song
Zhoujie Fu
Gang Yu
Guosheng Lin
131
5
0
16 Oct 2025
Learning an Image Editing Model without Image Editing Pairs
Nupur Kumari
Sheng-Yu Wang
Nanxuan Zhao
Yotam Nitzan
Yuheng Li
Krishna Kumar Singh
Richard Zhang
Eli Shechtman
Jun-Yan Zhu
Xun Huang
DiffM
309
3
0
16 Oct 2025
Constantly Improving Image Models Need Constantly Improving Benchmarks
Jiaxin Ge
Grace Luo
Heekyung Lee
Nishant Malpani
Long Lian
Xudong Wang
Aleksander Holynski
Trevor Darrell
Sewon Min
David M. Chan
VLM
111
0
0
16 Oct 2025
Adaptive Visual Conditioning for Semantic Consistency in Diffusion-Based Story Continuation
Seyed Mohammad Mousavi
Morteza Analoui
DiffM
125
0
0
15 Oct 2025
CanvasMAR: Improving Masked Autoregressive Video Generation With Canvas
Zian Li
Muhan Zhang
DiffM, VGen
156
0
0
15 Oct 2025
Edit-Your-Interest: Efficient Video Editing via Feature Most-Similar Propagation
Yi Zuo
Zitao Wang
Lingling Li
Xu Liu
Fang Liu
Licheng Jiao
DiffM, VGen
131
0
0
15 Oct 2025
Vectorized Video Representation with Easy Editing via Hierarchical Spatio-Temporally Consistent Proxy Embedding
Ye Chen
Liming Tan
Yupeng Zhu
Yuanbin Wang
Bingbing Ni
OCL
253
0
0
14 Oct 2025
SRUM: Fine-Grained Self-Rewarding for Unified Multimodal Models
Weiyang Jin
Yuwei Niu
Jiaqi Liao
Chengqi Duan
Aoxue Li
Shenghua Gao
Xihui Liu
LRM
208
4
0
14 Oct 2025
CoDefend: Cross-Modal Collaborative Defense via Diffusion Purification and Prompt Optimization
Fengling Zhu
Boshi Liu
Jingyu Hua
Sheng Zhong
DiffM, AAML
114
0
0
13 Oct 2025
IVEBench: Modern Benchmark Suite for Instruction-Guided Video Editing Assessment
Yinan Chen
Jiangning Zhang
T. Hu
Yuxiang Zeng
Zhucun Xue
Qingdong He
Chengjie Wang
Y. Liu
Xiaobin Hu
Shuicheng Yan
107
6
0
13 Oct 2025
EditCast3D: Single-Frame-Guided 3D Editing with Video Propagation and View Selection
Huaizhi Qu
Ruichen Zhang
Shuqing Luo
Luchao Qi
Zhihao Zhang
Xiaoming Liu
Roni Sengupta
Tianlong Chen
DiffM, VGen
139
0
0
11 Oct 2025
Color3D: Controllable and Consistent 3D Colorization with Personalized Colorizer
Yecong Wan
Mingwen Shao
Renlong Wu
Wangmeng Zuo
DiffM
134
0
0
11 Oct 2025
ReMix: Towards a Unified View of Consistent Character Generation and Editing
Benjia Zhou
Bin-Bin Fu
Pei Cheng
Y. Wang
Jiayuan Fan
Tao Chen
DiffM
118
0
0
11 Oct 2025
Mono4DEditor: Text-Driven 4D Scene Editing from Monocular Video via Point-Level Localization of Language-Embedded Gaussians
Jin-Chuan Shi
Chengye Su
Jiajun Wang
Ariel Shamir
Miao Wang
DiffM, 3DGS, VGen
121
0
0
10 Oct 2025
InstructX: Towards Unified Visual Editing with MLLM Guidance
Chong Mou
Qichao Sun
Yanze Wu
Pengze Zhang
Xinghui Li
Fulong Ye
Songtao Zhao
Qian He
MLLM
256
7
0
09 Oct 2025
Computationally-efficient Graph Modeling with Refined Graph Random Features
K. Choromanski
Avinava Dubey
Arijit Sehanobish
Isaac Reid
117
0
0
09 Oct 2025
MATRIX: Multimodal Agent Tuning for Robust Tool-Use Reasoning
Tajamul Ashraf
Umair Nawaz
Abdelrahman M. Shaker
Rao Muhammad Anwer
Philip Torr
Fahad Shahbaz Khan
Salman Khan
227
0
0
09 Oct 2025
UniVideo: Unified Understanding, Generation, and Editing for Videos
Cong Wei
Quande Liu
Zixuan Ye
Qiulin Wang
Xintao Wang
Pengfei Wan
Kun Gai
Wenhu Chen
VGen
262
14
0
09 Oct 2025
Beyond Textual CoT: Interleaved Text-Image Chains with Deep Confidence Reasoning for Image Editing
Zhentao Zou
Zhengrong Yue
Kunpeng Du
Binlei Bao
Hanting Li
...
Yue Zhou
Yali Wang
Jie Hu
Xue Jiang
X. Chen
LRM
183
0
0
09 Oct 2025
Kontinuous Kontext: Continuous Strength Control for Instruction-based Image Editing
Rishubh Parihar
Or Patashnik
Daniil Ostashev
R. Venkatesh Babu
Daniel Cohen-Or
Kuan-Chieh Wang
148
0
0
09 Oct 2025
DreamOmni2: Multimodal Instruction-based Editing and Generation
Bin Xia
Bohao Peng
Yuechen Zhang
Junjia Huang
Jiyang Liu
...
Chengyao Wang
Yitong Wang
Xinglong Wu
Bei Yu
Jiaya Jia
118
9
0
08 Oct 2025
Vision-Language-Action Models for Robotics: A Review Towards Real-World Applications
IEEE Access, 2025
Kento Kawaharazuka
Jihoon Oh
Jun Yamada
Ingmar Posner
Yuke Zhu
LM&Ro
273
27
0
08 Oct 2025
Ming-UniVision: Joint Image Understanding and Generation with a Unified Continuous Tokenizer
Ziyuan Huang
Dandan Zheng
Cheng Zou
Rui Liu
Xiaolong Wang
...
Jiajia Liu
Qingpei Guo
Ming-Hsuan Yang
Jingdong Chen
Jun Zhou
158
9
0
08 Oct 2025
Efficient High-Resolution Image Editing with Hallucination-Aware Loss and Adaptive Tiling
Young D. Kwon
Abhinav Mehrotra
Malcolm Chadwick
Alberto Gil C. P. Ramos
S. Bhattacharya
DiffM
168
0
0
07 Oct 2025
TBStar-Edit: From Image Editing Pattern Shifting to Consistency Enhancement
Hao Fang
Zechao Zhan
Weixin Feng
Ziwei Huang
Xubin Li
Tiezheng Ge
DiffM
341
0
0
06 Oct 2025
Factuality Matters: When Image Generation and Editing Meet Structured Visuals
Le Zhuo
Songhao Han
Yuandong Pu
Boxiang Qiu
Sayak Paul
...
Yihao Liu
Jie Shao
Xi Chen
Si Liu
Hongsheng Li
EGVM
244
2
0
06 Oct 2025
C3Editor: Achieving Controllable Consistency in 2D Model for 3D Editing
Zeng Tao
Zheng Ding
Zeyuan Chen
Xiang Zhang
Leizhi Li
Zhuowen Tu
DiffM
383
2
0
06 Oct 2025
ChronoEdit: Towards Temporal Reasoning for Image Editing and World Simulation
J. Wu
Xuanchi Ren
Tianchang Shen
Tianshi Cao
Kai He
...
Jose M. Alvarez
Jun Gao
Sanja Fidler
Zian Wang
Huan Ling
DiffM, VGen
234
3
0
05 Oct 2025
The Overlooked Value of Test-time Reference Sets in Visual Place Recognition
Mubariz Zaffar
Liangliang Nan
Sebastian Scherer
Julian F. P. Kooij
113
0
0
04 Oct 2025
Towards Scalable and Consistent 3D Editing
Ruihao Xia
Yang Tang
Pan Zhou
DiffM
160
2
0
03 Oct 2025
Growing Visual Generative Capacity for Pre-Trained MLLMs
Hanyu Wang
Jiaming Han
Ziyan Yang
Qi Zhao
Shanchuan Lin
Xiangyu Yue
Abhinav Shrivastava
Zhenheng Yang
Hao Chen
VLM
203
0
0
02 Oct 2025
Towards Better Optimization For Listwise Preference in Diffusion Models
Jiamu Bai
Xin Yu
Meilong Xu
Weitao Lu
Xin Pan
Kiwan Maeng
Daniel Kifer
Jian Wang
Yu Wang
EGVM
341
2
0
02 Oct 2025
FreeViS: Training-free Video Stylization with Inconsistent References
Jiacong Xu
Yiqun Mei
Ke Zhang
Vishal M. Patel
DiffM, VGen
208
2
0
02 Oct 2025
DragFlow: Unleashing DiT Priors with Region Based Supervision for Drag Editing
Zihan Zhou
Shilin Lu
Shuli Leng
Shaocong Zhang
Zhuming Lian
Xinlei Yu
A. Kong
DiffM
313
7
0
02 Oct 2025
Editable Noise Map Inversion: Encoding Target-image into Noise For High-Fidelity Image Manipulation
Mingyu Kang
Yong Suk Choi
DiffM
228
0
0
30 Sep 2025
EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing
Keming Wu
Sicong Jiang
Max Ku
Ping Nie
Minghao Liu
Wenhu Chen
116
9
0
30 Sep 2025
Query-Kontext: An Unified Multimodal Model for Image Generation and Editing
Yuxin Song
Wenkai Dong
Shizun Wang
Qi Zhang
Song Xue
...
H. Yang
Haocheng Feng
Hang Zhou
Xinyan Xiao
Jingdong Wang
DiffM, MLLM
153
5
0
30 Sep 2025
Dragging with Geometry: From Pixels to Geometry-Guided Image Editing
Xinyu Pu
Hongsong Wang
Jie Gui
Pan Zhou
DiffM
151
1
0
30 Sep 2025
LaTo: Landmark-tokenized Diffusion Transformer for Fine-grained Human Face Editing
Zhenghao Zhang
Ziying Zhang
Junchao Liao
Xiangyu Meng
Qiang Hu
Siyu Zhu
Xiaoyun Zhang
Long Qin
Weizhi Wang
144
0
0
30 Sep 2025
GaussEdit: Adaptive 3D Scene Editing with Text and Image Prompts
IEEE Transactions on Visualization and Computer Graphics (TVCG), 2025
Zhenyu Shu
Junlong Yu
Kai Chao
Shiqing Xin
Ligang Liu
3DGS
202
3
0
30 Sep 2025
IMG: Calibrating Diffusion Models via Implicit Multimodal Guidance
Jiayi Guo
Chuanhao Yan
Xingqian Xu
Yulin Wang
Kai Wang
Gao Huang
Humphrey Shi
143
1
0
30 Sep 2025
CharGen: Fast and Fluent Portrait Modification
Jan-Niklas Dihlmann
Arnela Killguss
Hendrik P. A. Lensch
DiffM
108
0
0
29 Sep 2025
Instruction Guided Multi Object Image Editing with Quantity and Layout Consistency
Jiaqi Tan
F. Li
Yang Liu
DiffM
109
0
0
29 Sep 2025
Environment-Aware Satellite Image Generation with Diffusion Models
Nikos Kostagiolas
Pantelis Georgiades
Yannis Panagakis
M. Nicolaou
105
0
0
29 Sep 2025
Causal-Adapter: Taming Text-to-Image Diffusion for Faithful Counterfactual Generation
Lei Tong
Zhihua Liu
Chaochao Lu
Dino Oglic
Tom Diethe
Philip Teare
Sotirios A. Tsaftaris
Chen Jin
DiffM, CML
284
1
0
29 Sep 2025
SCOPE: Semantic Conditioning for Sim2Real Category-Level Object Pose Estimation in Robotics
Peter Honig
S. Thalhammer
Jean-Baptiste Weibel
Matthias Hirschmanner
Markus Vincze
138
0
0
29 Sep 2025
ReLumix: Extending Image Relighting to Video via Video Diffusion Models
Lezhong Wang
Shutong Jin
Ruiqi Cui
Anders Bjorholm Dahl
J. Frisvad
Siavash Bigdeli
VGen
124
0
0
28 Sep 2025
VMDiff: Visual Mixing Diffusion for Limitless Cross-Object Synthesis
Zeren Xiong
Yue Yu
Zedong Zhang
Shuo Chen
J. Yang
Jun Yu Li
DiffM
159
0
0
28 Sep 2025