InstructPix2Pix: Learning to Follow Image Editing Instructions

Computer Vision and Pattern Recognition (CVPR), 2022

17 November 2022

Tim Brooks

Aleksander Holynski

Alexei A. Efros

DiffM

Papers citing "InstructPix2Pix: Learning to Follow Image Editing Instructions"

50 / 1,733 papers shown

GPT4Tools: Teaching Large Language Model to Use Tools via Self-instruction

Neural Information Processing Systems (NeurIPS), 2023

Sijie Zhao

Ying Shan

284

294

30 May 2023

Real-World Image Variation by Aligning Diffusion Inversion Chain

Neural Information Processing Systems (NeurIPS), 2023

336

30 May 2023

Controllable Text-to-Image Generation with GPT-4

Tianjun Zhang

347

29 May 2023

InstructEdit: Improving Automatic Masks for Diffusion-based Image Editing With User Instructions

249

29 May 2023

FuseCap: Leveraging Large Language Models for Enriched Fused Image Captions

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023

392

28 May 2023

Accelerating Text-to-Image Editing via Cache-Enabled Sparse Diffusion Inference

AAAI Conference on Artificial Intelligence (AAAI), 2023

312

27 May 2023

CRoSS: Diffusion Model Makes Controllable, Robust and Secure Image Steganography

Neural Information Processing Systems (NeurIPS), 2023

Jian Zhang

314

26 May 2023

Uni-ControlNet: All-in-One Control to Text-to-Image Diffusion Models

Neural Information Processing Systems (NeurIPS), 2023

Jianmin Bao

Lu Yuan

424

392

25 May 2023

Break-A-Scene: Extracting Multiple Concepts from a Single Image

ACM SIGGRAPH Conference and Exhibition on Computer Graphics and Interactive Techniques in Asia (SIGGRAPH Asia), 2023

Kfir Aberman

Daniel Cohen-Or

253

241

25 May 2023

Diversify Your Vision Datasets with Automatic Diffusion-Based Augmentation

Neural Information Processing Systems (NeurIPS), 2023

364

111

25 May 2023

CommonScenes: Generating Commonsense 3D Indoor Scenes with Scene Graph Diffusion

Nassir Navab

353

25 May 2023

ProSpect: Prompt Spectrum for Attribute-Aware Personalization of Diffusion Models

ACM Transactions on Graphics (TOG), 2023

433

120

25 May 2023

Towards Language-guided Interactive 3D Generation: LLMs as Layout Interpreter with Generative Feedback

Lin Wang

180

25 May 2023

Balancing the Picture: Debiasing Vision-Language Datasets with Synthetic Contrast Sets

Aleksandar Shtedritski

Max Bain

259

24 May 2023

LayoutGPT: Compositional Visual Planning and Generation with Large Language Models

Neural Information Processing Systems (NeurIPS), 2023

512

293

24 May 2023

InNeRF360: Text-Guided 3D-Consistent Object Inpainting on 360-degree Neural Radiance Fields

Computer Vision and Pattern Recognition (CVPR), 2023

Dongqing Wang

Tong Zhang

Alaa Abboud

Sabine Süsstrunk

186

24 May 2023

ChatFace: Chat-Guided Real Face Editing via Diffusion Latent Space Manipulation

318

24 May 2023

I Spy a Metaphor: Large Language Models and Diffusion Models Co-Create Visual Metaphors

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Marianna Apidianaki

Smaranda Muresan

DiffM

220

24 May 2023

BLIP-Diffusion: Pre-trained Subject Representation for Controllable Text-to-Image Generation and Editing

Neural Information Processing Systems (NeurIPS), 2023

Dongxu Li

Junnan Li

Steven C. H. Hoi

434

467

24 May 2023

Vision + Language Applications: A Survey

Yutong Zhou

N. Shimada

VLM

277

24 May 2023

Image Manipulation via Multi-Hop Instructions -- A New Dataset and Weakly-Supervised Neuro-Symbolic Approach

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

155

23 May 2023

DirecT2V: Large Language Models are Frame-Level Directors for Zero-Shot Text-to-Video Generation

294

23 May 2023

Control-A-Video: Controllable Text-to-Video Generation with Diffusion Models

Hefeng Wu

382

23 May 2023

LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models

Long Lian

Boyi Li

Adam Yala

Trevor Darrell

340

220

23 May 2023

Interactive Data Synthesis for Systematic Vision Adaptation via LLMs-AIGCs Collaboration

208

22 May 2023

The CLIP Model is Secretly an Image-to-Prompt Converter

Neural Information Processing Systems (NeurIPS), 2023

Yuxuan Ding

Chunna Tian

Haoxuan Ding

Lingqiao Liu

DiffM

150

22 May 2023

Guided Motion Diffusion for Controllable Human Motion Synthesis

IEEE International Conference on Computer Vision (ICCV), 2023

Korrawe Karunratanakul

Konpat Preechakul

Supasorn Suwajanakorn

Siyu Tang

DiffM

445

206

21 May 2023

InstructVid2Vid: Controllable Video Editing with Natural Language Instructions

IEEE International Conference on Multimedia and Expo (ICME), 2023

272

21 May 2023

Chupa: Carving 3D Clothed Humans from Skinned Shape Priors using 2D Diffusion Probabilistic Models

IEEE International Conference on Computer Vision (ICCV), 2023

399

19 May 2023

RoomDreamer: Text-Driven 3D Indoor Scene Synthesis with Coherent Geometry and Texture

ACM Multimedia (ACM MM), 2023

236

18 May 2023

LLMScore: Unveiling the Power of Large Language Models in Text-to-Image Synthesis Evaluation

Neural Information Processing Systems (NeurIPS), 2023

430

18 May 2023

DiffUTE: Universal Text Editing Diffusion Model

Neural Information Processing Systems (NeurIPS), 2023

328

18 May 2023

Preserve Your Own Correlation: A Noise Prior for Video Diffusion Models

IEEE International Conference on Computer Vision (ICCV), 2023

391

299

17 May 2023

Face Recognition Using Synthetic Face Data

186

17 May 2023

Make-A-Protagonist: Generic Video Editing with An Ensemble of Experts

Lanqing Hong

196

15 May 2023

Generative AI meets 3D: A Survey on Text-to-3D in AIGC Era

Lik-Hang Lee

257

10 May 2023

iEdit: Localised Text-guided Image Editing with Weak Supervision

200

10 May 2023

Text-guided High-definition Consistency Texture Model

Zhibin Tang

Tiantong He

DiffM

121

10 May 2023

Style-A-Video: Agile Diffusion for Arbitrary Text-based Video Style Transfer

IEEE Signal Processing Letters (IEEE SPL), 2023

235

09 May 2023

ReGeneration Learning of Diffusion Models with Rich Prompts for Zero-Shot Image Translation

144

08 May 2023

AADiff: Audio-Aligned Video Synthesis with Text-to-Image Diffusion

Nojun Kwak

290

06 May 2023

DisenBooth: Identity-Preserving Disentangled Tuning for Subject-Driven Text-to-Image Generation

International Conference on Learning Representations (ICLR), 2023

354

05 May 2023

Multimodal Procedural Planning via Dual Text-Image Prompting

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

234

02 May 2023

Key-Locked Rank One Editing for Text-to-Image Personalization

International Conference on Computer Graphics and Interactive Techniques (SIGGRAPH), 2023

435

218

02 May 2023

In-Context Learning Unlocked for Diffusion Models

Neural Information Processing Systems (NeurIPS), 2023

Mingyuan Zhou

338

01 May 2023

Let the Chart Spark: Embedding Semantic Context into Chart with Text-to-Image Generative Model

IEEE Transactions on Visualization and Computer Graphics (TVCG), 2023

346

28 Apr 2023

IconShop: Text-Guided Vector Icon Synthesis with Autoregressive Transformers

ACM Transactions on Graphics (TOG), 2023

490

27 Apr 2023

Learning Human-Human Interactions in Images from Weak Textual Supervision

IEEE International Conference on Computer Vision (ICCV), 2023

Morris Alper

Hadar Averbuch-Elor

VLM

385

27 Apr 2023

Patch Diffusion: Faster and More Data-Efficient Training of Diffusion Models

Neural Information Processing Systems (NeurIPS), 2023

Huangjie Zheng

Mingyuan Zhou

256

160

25 Apr 2023

SINC: Spatial Composition of 3D Human Motions for Simultaneous Action Generation

IEEE International Conference on Computer Vision (ICCV), 2023

443

20 Apr 2023