
InstructPix2Pix: Learning to Follow Image Editing Instructions

Computer Vision and Pattern Recognition (CVPR), 2023

17 November 2022

Tim Brooks

Aleksander Holynski

Alexei A. Efros

DiffM


Papers citing "InstructPix2Pix: Learning to Follow Image Editing Instructions"

50 / 1,731 papers shown

EditScore: Unlocking Online RL for Image Editing via High-Fidelity Reward Modeling

197

28 Sep 2025

LABELING COPILOT: A Deep Research Agent for Automated Data Curation in Computer Vision

Debargha Ganguly

Sumit Kumar

Ishwar B Balappanawar

Weicong Chen

Shashank Kambhatla

Srinivasan Iyengar

Shivkumar Kalyanaraman

Ponnurangam Kumaraguru

Vipin Chaudhary

VLM

186

26 Sep 2025

Training-Free Synthetic Data Generation with Dual IP-Adapter Guidance

113

26 Sep 2025

SAGE: Scene Graph-Aware Guidance and Execution for Long-Horizon Manipulation Tasks

132

26 Sep 2025

FlashEdit: Decoupling Speed, Structure, and Semantics for Precise Image Editing

244

26 Sep 2025

TDEdit: A Unified Diffusion Framework for Text-Drag Guided Image Manipulation

104

26 Sep 2025

Does FLUX Already Know How to Perform Physically Plausible Image Composition?

314

25 Sep 2025

UniTransfer: Video Concept Transfer via Progressive Spatial and Timestep Decomposition

154

25 Sep 2025

Guiding Audio Editing with Audio Language Model

169

25 Sep 2025

Evaluating the Evaluators: Metrics for Compositional Text-to-Image Generation

Mahdieh Soleymani Baghshah

M. Rohban

EGVM

247

25 Sep 2025

CAMILA: Context-Aware Masking for Image Editing with Language Alignment

Hyunseung Kim

Chiho Choi

Srikanth Malla

Sai Prahladh Padmanabhan

Saurabh Bagchi

Joon Hee Choi

289

24 Sep 2025

Unleashing the Potential of the Semantic Latent Space in Diffusion Models for Image Dehazing

European Conference on Computer Vision (ECCV), 2025

222

24 Sep 2025

Towards Application Aligned Synthetic Surgical Image Synthesis

Danush Kumar Venkatesh

Stefanie Speidel

MedIm

128

23 Sep 2025

One-shot Embroidery Customization via Contrastive LoRA Modulation

194

23 Sep 2025

Prompt-Guided Dual Latent Steering for Inversion Problems

186

23 Sep 2025

OmniBridge: Unified Multimodal Understanding, Generation, and Retrieval via Latent Space Alignment

Teng Xiao

Zuchao Li

Lefei Zhang

182

23 Sep 2025

Hyper-Bagel: A Unified Acceleration Framework for Multimodal Understanding and Generation

191

23 Sep 2025

Text Slider: Efficient and Plug-and-Play Continuous Concept Control for Image/Video Synthesis via LoRA Adapters

131

23 Sep 2025

Seg4Diff: Unveiling Open-Vocabulary Segmentation in Text-to-Image Diffusion Transformers

195

22 Sep 2025

CARINOX: Inference-time Scaling with Category-Aware Reward-based Initial Noise Optimization and Exploration

248

22 Sep 2025

Degradation-Aware All-in-One Image Restoration via Latent Prior Encoding

171

22 Sep 2025

Follow-Your-Emoji-Faster: Towards Efficient, Fine-Controllable, and Expressive Freestyle Portrait Animation

...

267

20 Sep 2025

A Vision-Language-Action-Critic Model for Robotic Real-World Reinforcement Learning

193

19 Sep 2025

UnifiedVisual: A Framework for Constructing Unified Vision-Language Datasets

141

18 Sep 2025

AutoEdit: Automatic Hyperparameter Tuning for Image Editing

189

18 Sep 2025

MultiEdit: Advancing Instruction-based Image Editing on Diverse and Challenging Tasks

144

18 Sep 2025

Controllable-Continuous Color Editing in Diffusion Model via Color Mapping

148

17 Sep 2025

EdiVal-Agent: An Object-Centric Framework for Automated, Fine-Grained Evaluation of Multi-Turn Editing

...

143

16 Sep 2025

Lego-Edit: A General Image Editing Framework with Model-Level Bricks and MLLM Builder

126

16 Sep 2025

HoloGarment: 360° Novel View Synthesis of In-the-Wild Garments

J. Karras

Yingwei Li

Yasamin Jafarian

Ira Kemelmacher-Shlizerman

133

15 Sep 2025

Robust Concept Erasure in Diffusion Models: A Theoretical Perspective on Security and Robustness

316

15 Sep 2025

Mask Consistency Regularization in Object Removal

138

12 Sep 2025

Fine-Grained Customized Fashion Design with Image-into-Prompt benchmark and dataset from LMM

11 Sep 2025

Target-oriented Multimodal Sentiment Classification with Counterfactual-enhanced Debiasing

Zhiyue Liu

Fanrong Ma

Xin Ling

11 Sep 2025

Prompt-Driven Image Analysis with Multimodal Generative AI: Detection, Segmentation, Inpainting, and Interpretation

Kaleem Ahmad

MLLM

10 Sep 2025

Imagining Alternatives: Towards High-Resolution 3D Counterfactual Medical Image Generation via Language Guidance

153

07 Sep 2025

OmniStyle2: Scalable and High Quality Artistic Style Transfer Data Generation via Destylization

133

07 Sep 2025

AURAD: Anatomy-Pathology Unified Radiology Synthesis with Progressive Representations

198

05 Sep 2025

Plot'n Polish: Zero-shot Story Visualization and Disentangled Editing with Text-to-Image Diffusion Models

131

04 Sep 2025

Inpaint4Drag: Repurposing Inpainting Models for Drag-Based Image Editing via Bidirectional Warping

Jingyi Lu

Kai Han

DiffM

190

04 Sep 2025

SSGaussian: Semantic-Aware and Structure-Preserving 3D Style Transfer

131

04 Sep 2025

From Editor to Dense Geometry Estimator

212

04 Sep 2025

Skywork UniPic 2.0: Building Kontext Model with Online RL for Unified Multimodal Model

...

182

04 Sep 2025

Durian: Dual Reference Image-Guided Portrait Animation with Attribute Transfer

138

04 Sep 2025

OneCAT: Decoder-Only Auto-Regressive Model for Unified Understanding and Generation

365

03 Sep 2025

Draw-In-Mind: Rebalancing Designer-Painter Roles in Unified Multimodal Models Benefits Image Editing

179

02 Sep 2025

Exploring Diffusion Models for Generative Forecasting of Financial Charts

02 Sep 2025

Discrete Noise Inversion for Next-scale Autoregressive Text-based Image Editing

213

02 Sep 2025

Category-Aware 3D Object Composition with Disentangled Texture and Shape Multi-view Diffusion

164

02 Sep 2025

PRINTER: Deformation-Aware Adversarial Learning for Virtual IHC Staining with In Situ Fidelity

01 Sep 2025