2211.09800
Cited By
v1
v2 (latest)
InstructPix2Pix: Learning to Follow Image Editing Instructions
Computer Vision and Pattern Recognition (CVPR), 2022
17 November 2022
Tim Brooks
Aleksander Holynski
Alexei A. Efros
DiffM
Papers citing "InstructPix2Pix: Learning to Follow Image Editing Instructions"
50 / 1,733 papers shown
GPT4Tools: Teaching Large Language Model to Use Tools via Self-instruction
Neural Information Processing Systems (NeurIPS), 2023
Rui Yang
Lin Song
Yanwei Li
Sijie Zhao
Yixiao Ge
Xiu Li
Ying Shan
SyDa
MLLM
284
294
0
30 May 2023
Real-World Image Variation by Aligning Diffusion Inversion Chain
Neural Information Processing Systems (NeurIPS), 2023
Yuechen Zhang
Jinbo Xing
Eric Lo
Jiaya Jia
336
42
0
30 May 2023
Controllable Text-to-Image Generation with GPT-4
Tianjun Zhang
Yi Zhang
Vibhav Vineet
Neel Joshi
Xin Eric Wang
DiffM
347
61
0
29 May 2023
InstructEdit: Improving Automatic Masks for Diffusion-based Image Editing With User Instructions
Qian Wang
Biao Zhang
Michael Birsak
Peter Wonka
DiffM
249
56
0
29 May 2023
FuseCap: Leveraging Large Language Models for Enriched Fused Image Captions
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Noam Rotstein
David Bensaid
Shaked Brody
Roy Ganz
Ron Kimmel
VLM
392
53
0
28 May 2023
Accelerating Text-to-Image Editing via Cache-Enabled Sparse Diffusion Inference
AAAI Conference on Artificial Intelligence (AAAI), 2023
Zihao Yu
Haoyang Li
Fangcheng Fu
Xupeng Miao
Tengjiao Wang
DiffM
312
16
0
27 May 2023
CRoSS: Diffusion Model Makes Controllable, Robust and Secure Image Steganography
Neural Information Processing Systems (NeurIPS), 2023
Jiwen Yu
Xuanyu Zhang
You-song Xu
Jian Zhang
DiffM
314
95
0
26 May 2023
Uni-ControlNet: All-in-One Control to Text-to-Image Diffusion Models
Neural Information Processing Systems (NeurIPS), 2023
Shihao Zhao
Dongdong Chen
Yen-Chun Chen
Jianmin Bao
Shaozhe Hao
Lu Yuan
Kwan-Yee K. Wong
424
392
0
25 May 2023
Break-A-Scene: Extracting Multiple Concepts from a Single Image
ACM SIGGRAPH Conference and Exhibition on Computer Graphics and Interactive Techniques in Asia (SIGGRAPH Asia), 2023
Omri Avrahami
Kfir Aberman
Ohad Fried
Daniel Cohen-Or
Dani Lischinski
VLM
DiffM
253
241
0
25 May 2023
Diversify Your Vision Datasets with Automatic Diffusion-Based Augmentation
Neural Information Processing Systems (NeurIPS), 2023
Lisa Dunlap
Alyssa Umino
Han Zhang
Jiezhi Yang
Joseph E. Gonzalez
Trevor Darrell
DiffM
364
111
0
25 May 2023
CommonScenes: Generating Commonsense 3D Indoor Scenes with Scene Graph Diffusion
Guangyao Zhai
Evin Pınar Örnek
Shun-cheng Wu
Yan Di
F. Tombari
Nassir Navab
Benjamin Busam
DiffM
353
31
0
25 May 2023
ProSpect: Prompt Spectrum for Attribute-Aware Personalization of Diffusion Models
ACM Transactions on Graphics (TOG), 2023
Yuxin Zhang
Weiming Dong
Fan Tang
Nisha Huang
Haibin Huang
Chongyang Ma
Tong-Yee Lee
Oliver Deussen
Changsheng Xu
DiffM
433
120
0
25 May 2023
Towards Language-guided Interactive 3D Generation: LLMs as Layout Interpreter with Generative Feedback
Yiqi Lin
Hao Wu
Ruichen Wang
H. Lu
Xiaodong Lin
Hui Xiong
Lin Wang
3DV
180
17
0
25 May 2023
Balancing the Picture: Debiasing Vision-Language Datasets with Synthetic Contrast Sets
Brandon Smith
Miguel Farinha
Elizaveta Semenova
Hannah Rose Kirk
Aleksandar Shtedritski
Max Bain
259
26
0
24 May 2023
LayoutGPT: Compositional Visual Planning and Generation with Large Language Models
Neural Information Processing Systems (NeurIPS), 2023
Weixi Feng
Wanrong Zhu
Tsu-Jui Fu
Varun Jampani
Arjun Reddy Akula
Xuehai He
Sugato Basu
Xinze Wang
William Yang Wang
MLLM
512
293
0
24 May 2023
InNeRF360: Text-Guided 3D-Consistent Object Inpainting on 360-degree Neural Radiance Fields
Computer Vision and Pattern Recognition (CVPR), 2023
Dongqing Wang
Tong Zhang
Alaa Abboud
Sabine Süsstrunk
186
17
0
24 May 2023
ChatFace: Chat-Guided Real Face Editing via Diffusion Latent Space Manipulation
Dongxu Yue
Qin Guo
Munan Ning
Jiaxi Cui
Yuesheng Zhu
Liuliang Yuan
DiffM
318
15
0
24 May 2023
I Spy a Metaphor: Large Language Models and Diffusion Models Co-Create Visual Metaphors
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Tuhin Chakrabarty
Arkadiy Saakyan
Olivia Winn
Artemis Panagopoulou
Yue Yang
Marianna Apidianaki
Smaranda Muresan
DiffM
220
62
0
24 May 2023
BLIP-Diffusion: Pre-trained Subject Representation for Controllable Text-to-Image Generation and Editing
Neural Information Processing Systems (NeurIPS), 2023
Dongxu Li
Junnan Li
Steven C. H. Hoi
434
467
0
24 May 2023
Vision + Language Applications: A Survey
Yutong Zhou
N. Shimada
VLM
277
13
0
24 May 2023
Image Manipulation via Multi-Hop Instructions -- A New Dataset and Weakly-Supervised Neuro-Symbolic Approach
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Harman Singh
Poorva Garg
M. Gupta
Kevin Shah
Ashish Goswami
A. Mondal
Arnab Kumar Mondal
Dinesh Khandelwal
Dinesh Garg
Parag Singla
LM&Ro
155
2
0
23 May 2023
DirecT2V: Large Language Models are Frame-Level Directors for Zero-Shot Text-to-Video Generation
Susung Hong
Junyoung Seo
Heeseong Shin
Sung-Jin Hong
Seung Wook Kim
DiffM
VGen
294
54
0
23 May 2023
Control-A-Video: Controllable Text-to-Video Generation with Diffusion Models
Weifeng Chen
Yatai Ji
Jie Wu
Hefeng Wu
Pan Xie
Jiashi Li
Xin Xia
Xuefeng Xiao
Liang Lin
VGen
382
92
0
23 May 2023
LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models
Long Lian
Boyi Li
Adam Yala
Trevor Darrell
340
220
0
23 May 2023
Interactive Data Synthesis for Systematic Vision Adaptation via LLMs-AIGCs Collaboration
Qifan Yu
Juncheng Li
Wentao Ye
Siliang Tang
Yueting Zhuang
208
14
0
22 May 2023
The CLIP Model is Secretly an Image-to-Prompt Converter
Neural Information Processing Systems (NeurIPS), 2023
Yuxuan Ding
Chunna Tian
Haoxuan Ding
Lingqiao Liu
DiffM
150
17
0
22 May 2023
Guided Motion Diffusion for Controllable Human Motion Synthesis
IEEE International Conference on Computer Vision (ICCV), 2023
Korrawe Karunratanakul
Konpat Preechakul
Supasorn Suwajanakorn
Siyu Tang
DiffM
445
206
0
21 May 2023
InstructVid2Vid: Controllable Video Editing with Natural Language Instructions
IEEE International Conference on Multimedia and Expo (ICME), 2023
Bosheng Qin
Juncheng Li
Siliang Tang
Tat-Seng Chua
Yueting Zhuang
VGen
DiffM
272
31
0
21 May 2023
Chupa: Carving 3D Clothed Humans from Skinned Shape Priors using 2D Diffusion Probabilistic Models
IEEE International Conference on Computer Vision (ICCV), 2023
Byungjun Kim
Patrick Kwon
K. Lee
Myunggi Lee
Sookwan Han
Daesik Kim
Hanbyul Joo
DiffM
399
26
0
19 May 2023
RoomDreamer: Text-Driven 3D Indoor Scene Synthesis with Coherent Geometry and Texture
ACM Multimedia (ACM MM), 2023
Liangchen Song
Liangliang Cao
Hongyu Xu
Kai Kang
Feng Tang
Junsong Yuan
Yang Zhao
VGen
DiffM
236
59
0
18 May 2023
LLMScore: Unveiling the Power of Large Language Models in Text-to-Image Synthesis Evaluation
Neural Information Processing Systems (NeurIPS), 2023
Yujie Lu
Xianjun Yang
Xiujun Li
Xinze Wang
William Yang Wang
EGVM
430
99
0
18 May 2023
DiffUTE: Universal Text Editing Diffusion Model
Neural Information Processing Systems (NeurIPS), 2023
Haoxing Chen
Zhuoer Xu
Zhangxuan Gu
Jun Lan
Xing Zheng
Yaohui Li
Changhua Meng
Huijia Zhu
Weiqiang Wang
DiffM
328
46
0
18 May 2023
Preserve Your Own Correlation: A Noise Prior for Video Diffusion Models
IEEE International Conference on Computer Vision (ICCV), 2023
Songwei Ge
Seungjun Nah
Guilin Liu
Tyler Poon
Andrew Tao
Bryan Catanzaro
David Jacobs
Jia-Bin Huang
Ming-Yuan Liu
Yogesh Balaji
DiffM
VGen
391
299
0
17 May 2023
Face Recognition Using Synthetic Face Data
Omer Granoviter
Alexey Gruzdev
V. Loginov
Max Kogan
Orly Zvitia
186
1
0
17 May 2023
Make-A-Protagonist: Generic Video Editing with An Ensemble of Experts
Yuyang Zhao
Enze Xie
Lanqing Hong
Zhenguo Li
G. Lee
DiffM
VGen
196
41
0
15 May 2023
Generative AI meets 3D: A Survey on Text-to-3D in AIGC Era
Chenghao Li
Chaoning Zhang
Atish Waghwase
Lik-Hang Lee
François Rameau
Yang Yang
Sung-Ho Bae
Choong Seon Hong
257
96
0
10 May 2023
iEdit: Localised Text-guided Image Editing with Weak Supervision
Rumeysa Bodur
Erhan Gundogdu
Binod Bhattarai
Tae-Kyun Kim
M. Donoser
Loris Bazzani
DiffM
200
20
0
10 May 2023
Text-guided High-definition Consistency Texture Model
Zhibin Tang
Tiantong He
DiffM
121
6
0
10 May 2023
Style-A-Video: Agile Diffusion for Arbitrary Text-based Video Style Transfer
IEEE Signal Processing Letters (IEEE SPL), 2023
Nisha Huang
Yuxin Zhang
Weiming Dong
DiffM
VGen
235
25
0
09 May 2023
ReGeneration Learning of Diffusion Models with Rich Prompts for Zero-Shot Image Translation
Yupei Lin
Senyang Zhang
Xiaojun Yang
Tianlin Li
Yukai Shi
DiffM
144
7
0
08 May 2023
AADiff: Audio-Aligned Video Synthesis with Text-to-Image Diffusion
Seungwoo Lee
Chaerin Kong
D. Jeon
Nojun Kwak
DiffM
290
24
0
06 May 2023
DisenBooth: Identity-Preserving Disentangled Tuning for Subject-Driven Text-to-Image Generation
International Conference on Learning Representations (ICLR), 2023
Hong Chen
Yipeng Zhang
Simin Wu
Xin Eric Wang
Xuguang Duan
Yuwei Zhou
Wenwu Zhu
DiffM
354
73
0
05 May 2023
Multimodal Procedural Planning via Dual Text-Image Prompting
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Yujie Lu
Pan Lu
Zhiyu Zoey Chen
Wanrong Zhu
Xinze Wang
William Yang Wang
LM&Ro
234
53
0
02 May 2023
Key-Locked Rank One Editing for Text-to-Image Personalization
International Conference on Computer Graphics and Interactive Techniques (SIGGRAPH), 2023
Yoad Tewel
Rinon Gal
Gal Chechik
Yuval Atzmon
DiffM
435
218
0
02 May 2023
In-Context Learning Unlocked for Diffusion Models
Neural Information Processing Systems (NeurIPS), 2023
Zhendong Wang
Lezhi Li
Yadong Lu
Yelong Shen
Pengcheng He
Weizhu Chen
Zinan Lin
Mingyuan Zhou
VLM
DiffM
338
97
0
01 May 2023
Let the Chart Spark: Embedding Semantic Context into Chart with Text-to-Image Generative Model
IEEE Transactions on Visualization and Computer Graphics (TVCG), 2023
Shishi Xiao
Suizi Huang
Yue Lin
Yilin Ye
Weizhen Zeng
346
47
0
28 Apr 2023
IconShop: Text-Guided Vector Icon Synthesis with Autoregressive Transformers
ACM Transactions on Graphics (TOG), 2023
Rong Wu
Wanchao Su
Kede Ma
Jing Liao
490
63
0
27 Apr 2023
Learning Human-Human Interactions in Images from Weak Textual Supervision
IEEE International Conference on Computer Vision (ICCV), 2023
Morris Alper
Hadar Averbuch-Elor
VLM
385
3
0
27 Apr 2023
Patch Diffusion: Faster and More Data-Efficient Training of Diffusion Models
Neural Information Processing Systems (NeurIPS), 2023
Zhendong Wang
Lezhi Li
Huangjie Zheng
Peihao Wang
Pengcheng He
Zinan Lin
Weizhu Chen
Mingyuan Zhou
256
160
0
25 Apr 2023
SINC: Spatial Composition of 3D Human Motions for Simultaneous Action Generation
IEEE International Conference on Computer Vision (ICCV), 2023
Nikos Athanasiou
Mathis Petrovich
Michael J. Black
Gül Varol
443
61
0
20 Apr 2023