InstructPix2Pix: Learning to Follow Image Editing Instructions

Computer Vision and Pattern Recognition (CVPR), 2022
17 November 2022
Tim Brooks
Aleksander Holynski
Alexei A. Efros
    DiffM

Papers citing "InstructPix2Pix: Learning to Follow Image Editing Instructions"

50 / 1,733 papers shown
GPT4Tools: Teaching Large Language Model to Use Tools via Self-instruction
Neural Information Processing Systems (NeurIPS), 2023
Rui Yang
Lin Song
Yanwei Li
Sijie Zhao
Yixiao Ge
Xiu Li
Ying Shan
SyDa MLLM
284
294
0
30 May 2023
Real-World Image Variation by Aligning Diffusion Inversion Chain
Neural Information Processing Systems (NeurIPS), 2023
Yuechen Zhang
Jinbo Xing
Eric Lo
Jiaya Jia
336
42
0
30 May 2023
Controllable Text-to-Image Generation with GPT-4
Tianjun Zhang
Yi Zhang
Vibhav Vineet
Neel Joshi
Xin Eric Wang
DiffM
347
61
0
29 May 2023
InstructEdit: Improving Automatic Masks for Diffusion-based Image Editing With User Instructions
Qian Wang
Biao Zhang
Michael Birsak
Peter Wonka
DiffM
249
56
0
29 May 2023
FuseCap: Leveraging Large Language Models for Enriched Fused Image Captions
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Noam Rotstein
David Bensaid
Shaked Brody
Roy Ganz
Ron Kimmel
VLM
392
53
0
28 May 2023
Accelerating Text-to-Image Editing via Cache-Enabled Sparse Diffusion Inference
AAAI Conference on Artificial Intelligence (AAAI), 2023
Zihao Yu
Haoyang Li
Fangcheng Fu
Xupeng Miao
Tengjiao Wang
DiffM
312
16
0
27 May 2023
CRoSS: Diffusion Model Makes Controllable, Robust and Secure Image Steganography
Neural Information Processing Systems (NeurIPS), 2023
Jiwen Yu
Xuanyu Zhang
You-song Xu
Jian Zhang
DiffM
314
95
0
26 May 2023
Uni-ControlNet: All-in-One Control to Text-to-Image Diffusion Models
Neural Information Processing Systems (NeurIPS), 2023
Shihao Zhao
Dongdong Chen
Yen-Chun Chen
Jianmin Bao
Shaozhe Hao
Lu Yuan
Kwan-Yee K. Wong
424
392
0
25 May 2023
Break-A-Scene: Extracting Multiple Concepts from a Single Image
ACM SIGGRAPH Conference and Exhibition on Computer Graphics and Interactive Techniques in Asia (SIGGRAPH Asia), 2023
Omri Avrahami
Kfir Aberman
Ohad Fried
Daniel Cohen-Or
Dani Lischinski
VLM DiffM
253
241
0
25 May 2023
Diversify Your Vision Datasets with Automatic Diffusion-Based Augmentation
Neural Information Processing Systems (NeurIPS), 2023
Lisa Dunlap
Alyssa Umino
Han Zhang
Jiezhi Yang
Joseph E. Gonzalez
Trevor Darrell
DiffM
364
111
0
25 May 2023
CommonScenes: Generating Commonsense 3D Indoor Scenes with Scene Graph Diffusion
Guangyao Zhai
Evin Pınar Örnek
Shun-cheng Wu
Yan Di
F. Tombari
Nassir Navab
Benjamin Busam
DiffM
353
31
0
25 May 2023
ProSpect: Prompt Spectrum for Attribute-Aware Personalization of Diffusion Models
ACM Transactions on Graphics (TOG), 2023
Yuxin Zhang
Weiming Dong
Fan Tang
Nisha Huang
Haibin Huang
Chongyang Ma
Tong-Yee Lee
Oliver Deussen
Changsheng Xu
DiffM
433
120
0
25 May 2023
Towards Language-guided Interactive 3D Generation: LLMs as Layout Interpreter with Generative Feedback
Yiqi Lin
Hao Wu
Ruichen Wang
H. Lu
Xiaodong Lin
Hui Xiong
Lin Wang
3DV
180
17
0
25 May 2023
Balancing the Picture: Debiasing Vision-Language Datasets with Synthetic Contrast Sets
Brandon Smith
Miguel Farinha
Elizaveta Semenova
Hannah Rose Kirk
Aleksandar Shtedritski
Max Bain
259
26
0
24 May 2023
LayoutGPT: Compositional Visual Planning and Generation with Large Language Models
Neural Information Processing Systems (NeurIPS), 2023
Weixi Feng
Wanrong Zhu
Tsu-Jui Fu
Varun Jampani
Arjun Reddy Akula
Xuehai He
Sugato Basu
Xinze Wang
William Yang Wang
MLLM
512
293
0
24 May 2023
InNeRF360: Text-Guided 3D-Consistent Object Inpainting on 360-degree Neural Radiance Fields
Computer Vision and Pattern Recognition (CVPR), 2023
Dongqing Wang
Tong Zhang
Alaa Abboud
Sabine Süsstrunk
186
17
0
24 May 2023
ChatFace: Chat-Guided Real Face Editing via Diffusion Latent Space Manipulation
Dongxu Yue
Qin Guo
Munan Ning
Jiaxi Cui
Yuesheng Zhu
Liuliang Yuan
DiffM
318
15
0
24 May 2023
I Spy a Metaphor: Large Language Models and Diffusion Models Co-Create Visual Metaphors
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Tuhin Chakrabarty
Arkadiy Saakyan
Olivia Winn
Artemis Panagopoulou
Yue Yang
Marianna Apidianaki
Smaranda Muresan
DiffM
220
62
0
24 May 2023
BLIP-Diffusion: Pre-trained Subject Representation for Controllable Text-to-Image Generation and Editing
Neural Information Processing Systems (NeurIPS), 2023
Dongxu Li
Junnan Li
Steven C. H. Hoi
434
467
0
24 May 2023
Vision + Language Applications: A Survey
Yutong Zhou
N. Shimada
VLM
277
13
0
24 May 2023
Image Manipulation via Multi-Hop Instructions -- A New Dataset and Weakly-Supervised Neuro-Symbolic Approach
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Harman Singh
Poorva Garg
M. Gupta
Kevin Shah
Ashish Goswami
A. Mondal
Arnab Kumar Mondal
Dinesh Khandelwal
Dinesh Garg
Parag Singla
LM&Ro
155
2
0
23 May 2023
DirecT2V: Large Language Models are Frame-Level Directors for Zero-Shot Text-to-Video Generation
Susung Hong
Junyoung Seo
Heeseong Shin
Sung‐Jin Hong
Seung Wook Kim
DiffM VGen
294
54
0
23 May 2023
Control-A-Video: Controllable Text-to-Video Generation with Diffusion Models
Weifeng Chen
Yatai Ji
Jie Wu
Hefeng Wu
Pan Xie
Jiashi Li
Xin Xia
Xuefeng Xiao
Liang Lin
VGen
382
92
0
23 May 2023
LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models
Long Lian
Boyi Li
Adam Yala
Trevor Darrell
340
220
0
23 May 2023
Interactive Data Synthesis for Systematic Vision Adaptation via LLMs-AIGCs Collaboration
Qifan Yu
Juncheng Li
Wentao Ye
Siliang Tang
Yueting Zhuang
208
14
0
22 May 2023
The CLIP Model is Secretly an Image-to-Prompt Converter
Neural Information Processing Systems (NeurIPS), 2023
Yuxuan Ding
Chunna Tian
Haoxuan Ding
Lingqiao Liu
DiffM
150
17
0
22 May 2023
Guided Motion Diffusion for Controllable Human Motion Synthesis
IEEE International Conference on Computer Vision (ICCV), 2023
Korrawe Karunratanakul
Konpat Preechakul
Supasorn Suwajanakorn
Siyu Tang
DiffM
445
206
0
21 May 2023
InstructVid2Vid: Controllable Video Editing with Natural Language Instructions
IEEE International Conference on Multimedia and Expo (ICME), 2023
Bosheng Qin
Juncheng Li
Siliang Tang
Tat-Seng Chua
Yueting Zhuang
VGen DiffM
272
31
0
21 May 2023
Chupa: Carving 3D Clothed Humans from Skinned Shape Priors using 2D Diffusion Probabilistic Models
IEEE International Conference on Computer Vision (ICCV), 2023
Byungjun Kim
Patrick Kwon
K. Lee
Myunggi Lee
Sookwan Han
Daesik Kim
Hanbyul Joo
DiffM
399
26
0
19 May 2023
RoomDreamer: Text-Driven 3D Indoor Scene Synthesis with Coherent Geometry and Texture
ACM Multimedia (ACM MM), 2023
Liangchen Song
Liangliang Cao
Hongyu Xu
Kai Kang
Feng Tang
Junsong Yuan
Yang Zhao
VGen DiffM
236
59
0
18 May 2023
LLMScore: Unveiling the Power of Large Language Models in Text-to-Image Synthesis Evaluation
Neural Information Processing Systems (NeurIPS), 2023
Yujie Lu
Xianjun Yang
Xiujun Li
Xinze Wang
William Yang Wang
EGVM
430
99
0
18 May 2023
DiffUTE: Universal Text Editing Diffusion Model
Neural Information Processing Systems (NeurIPS), 2023
Haoxing Chen
Zhuoer Xu
Zhangxuan Gu
Jun Lan
Xing Zheng
Yaohui Li
Changhua Meng
Huijia Zhu
Weiqiang Wang
DiffM
328
46
0
18 May 2023
Preserve Your Own Correlation: A Noise Prior for Video Diffusion Models
IEEE International Conference on Computer Vision (ICCV), 2023
Songwei Ge
Seungjun Nah
Guilin Liu
Tyler Poon
Andrew Tao
Bryan Catanzaro
David Jacobs
Jia-Bin Huang
Ming-Yuan Liu
Yogesh Balaji
DiffM VGen
391
299
0
17 May 2023
Face Recognition Using Synthetic Face Data
Omer Granoviter
Alexey Gruzdev
V. Loginov
Max Kogan
Orly Zvitia
186
1
0
17 May 2023
Make-A-Protagonist: Generic Video Editing with An Ensemble of Experts
Yuyang Zhao
Enze Xie
Lanqing Hong
Zhenguo Li
G. Lee
DiffM VGen
196
41
0
15 May 2023
Generative AI meets 3D: A Survey on Text-to-3D in AIGC Era
Chenghao Li
Chaoning Zhang
Atish Waghwase
Lik-Hang Lee
François Rameau
Yang Yang
Sung-Ho Bae
Choong Seon Hong
257
96
0
10 May 2023
iEdit: Localised Text-guided Image Editing with Weak Supervision
Rumeysa Bodur
Erhan Gundogdu
Binod Bhattarai
Tae-Kyun Kim
M. Donoser
Loris Bazzani
DiffM
200
20
0
10 May 2023
Text-guided High-definition Consistency Texture Model
Zhibin Tang
Tiantong He
DiffM
121
6
0
10 May 2023
Style-A-Video: Agile Diffusion for Arbitrary Text-based Video Style Transfer
IEEE Signal Processing Letters (IEEE SPL), 2023
Nisha Huang
Yuxin Zhang
Weiming Dong
DiffM VGen
235
25
0
09 May 2023
ReGeneration Learning of Diffusion Models with Rich Prompts for Zero-Shot Image Translation
Yupei Lin
Senyang Zhang
Xiaojun Yang
Tianlin Li
Yukai Shi
DiffM
144
7
0
08 May 2023
AADiff: Audio-Aligned Video Synthesis with Text-to-Image Diffusion
Seungwoo Lee
Chaerin Kong
D. Jeon
Nojun Kwak
DiffM
290
24
0
06 May 2023
DisenBooth: Identity-Preserving Disentangled Tuning for Subject-Driven Text-to-Image Generation
International Conference on Learning Representations (ICLR), 2023
Hong Chen
Yipeng Zhang
Simin Wu
Xin Eric Wang
Xuguang Duan
Yuwei Zhou
Wenwu Zhu
DiffM
354
73
0
05 May 2023
Multimodal Procedural Planning via Dual Text-Image Prompting
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Yujie Lu
Pan Lu
Zhiyu Zoey Chen
Wanrong Zhu
Xinze Wang
William Yang Wang
LM&Ro
234
53
0
02 May 2023
Key-Locked Rank One Editing for Text-to-Image Personalization
International Conference on Computer Graphics and Interactive Techniques (SIGGRAPH), 2023
Yoad Tewel
Rinon Gal
Gal Chechik
Yuval Atzmon
DiffM
435
218
0
02 May 2023
In-Context Learning Unlocked for Diffusion Models
Neural Information Processing Systems (NeurIPS), 2023
Zhendong Wang
Lezhi Li
Yadong Lu
Yelong Shen
Pengcheng He
Weizhu Chen
Zinan Lin
Mingyuan Zhou
VLM DiffM
338
97
0
01 May 2023
Let the Chart Spark: Embedding Semantic Context into Chart with Text-to-Image Generative Model
IEEE Transactions on Visualization and Computer Graphics (TVCG), 2023
Shishi Xiao
Suizi Huang
Yue Lin
Yilin Ye
Weizhen Zeng
346
47
0
28 Apr 2023
IconShop: Text-Guided Vector Icon Synthesis with Autoregressive Transformers
ACM Transactions on Graphics (TOG), 2023
Rong Wu
Wanchao Su
Kede Ma
Jing Liao
490
63
0
27 Apr 2023
Learning Human-Human Interactions in Images from Weak Textual Supervision
IEEE International Conference on Computer Vision (ICCV), 2023
Morris Alper
Hadar Averbuch-Elor
VLM
385
3
0
27 Apr 2023
Patch Diffusion: Faster and More Data-Efficient Training of Diffusion Models
Neural Information Processing Systems (NeurIPS), 2023
Zhendong Wang
Lezhi Li
Huangjie Zheng
Peihao Wang
Pengcheng He
Zinan Lin
Weizhu Chen
Mingyuan Zhou
256
160
0
25 Apr 2023
SINC: Spatial Composition of 3D Human Motions for Simultaneous Action Generation
IEEE International Conference on Computer Vision (ICCV), 2023
Nikos Athanasiou
Mathis Petrovich
Michael J. Black
Gül Varol
443
61
0
20 Apr 2023