ResearchTrend.AI
InstructPix2Pix: Learning to Follow Image Editing Instructions

17 November 2022
Tim Brooks
Aleksander Holynski
Alexei A. Efros
    DiffM

Papers citing "InstructPix2Pix: Learning to Follow Image Editing Instructions"

50 / 290 papers shown
Adding Conditional Control to Diffusion Models with Reinforcement Learning
Yulai Zhao
Masatoshi Uehara
Gabriele Scalia
Tommaso Biancalani
Sergey Levine
Ehsan Hajiramezanali
AI4CE
52
3
0
17 Jun 2024
MMFakeBench: A Mixed-Source Multimodal Misinformation Detection Benchmark for LVLMs
Xuannan Liu
Zekun Li
Peipei Li
Shuhan Xia
Xing Cui
Linzhi Huang
Huaibo Huang
Weihong Deng
Zhaofeng He
36
11
0
13 Jun 2024
Is One GPU Enough? Pushing Image Generation at Higher-Resolutions with Foundation Models
Athanasios Tragakis
Marco Aversa
Chaitanya Kaul
Roderick Murray-Smith
Daniele Faccio
44
2
0
11 Jun 2024
MS-Diffusion: Multi-subject Zero-shot Image Personalization with Layout Guidance
X. Wang
Siming Fu
Qihan Huang
Wanggui He
Hao Jiang
DiffM
34
41
0
11 Jun 2024
Diffusion-RPO: Aligning Diffusion Models through Relative Preference Optimization
Yi Gu
Zhendong Wang
Yueqin Yin
Yujia Xie
Mingyuan Zhou
24
15
0
10 Jun 2024
Towards Semantic Equivalence of Tokenization in Multimodal LLM
Shengqiong Wu
Hao Fei
Xiangtai Li
Jiayi Ji
Hanwang Zhang
Tat-Seng Chua
Shuicheng Yan
MLLM
59
25
0
07 Jun 2024
Bayesian Power Steering: An Effective Approach for Domain Adaptation of Diffusion Models
Ding Huang
Ting Li
Jian Huang
DiffM
26
1
0
06 Jun 2024
Dream-in-Style: Text-to-3D Generation Using Stylized Score Distillation
Hubert Kompanowski
Binh-Son Hua
DiffM
52
3
0
05 Jun 2024
Enhancing Temporal Consistency in Video Editing by Reconstructing Videos with 3D Gaussian Splatting
Inkyu Shin
Qihang Yu
Xiaohui Shen
In So Kweon
Kuk-Jin Yoon
Liang-Chieh Chen
VGen
DiffM
66
1
0
04 Jun 2024
Vista: A Generalizable Driving World Model with High Fidelity and Versatile Controllability
Shenyuan Gao
Jiazhi Yang
Li Chen
Kashyap Chitta
Yihang Qiu
Andreas Geiger
Jun Zhang
Hongyang Li
60
75
0
27 May 2024
I2VEdit: First-Frame-Guided Video Editing via Image-to-Video Diffusion Models
Wenqi Ouyang
Yi Dong
Lei Yang
Jianlou Si
Xingang Pan
VGen
DiffM
37
11
0
26 May 2024
User-Friendly Customized Generation with Multi-Modal Prompts
Linhao Zhong
Yan Hong
Wentao Chen
Binglin Zhou
Yiyi Zhang
Jianfu Zhang
Liqing Zhang
DiffM
35
0
0
26 May 2024
ModelLock: Locking Your Model With a Spell
Yifeng Gao
Yuhua Sun
Xingjun Ma
Zuxuan Wu
Yu-Gang Jiang
VLM
40
1
0
25 May 2024
Challenges and Opportunities in 3D Content Generation
Ke Zhao
Andreas Larsen
22
0
0
24 May 2024
Looking Backward: Streaming Video-to-Video Translation with Feature Banks
Feng Liang
Akio Kodaira
Chenfeng Xu
M. Tomizuka
Kurt Keutzer
Diana Marculescu
DiffM
VGen
61
7
0
24 May 2024
Personalized Residuals for Concept-Driven Text-to-Image Generation
Cusuh Ham
Matthew Fisher
James Hays
Nicholas I. Kolkin
Yuchen Liu
Richard Y. Zhang
Tobias Hinz
DiffM
29
7
0
21 May 2024
Customize Your Own Paired Data via Few-shot Way
Jinshu Chen
Bingchuan Li
Miao Hua
Panpan Xu
Qian He
DiffM
27
0
0
21 May 2024
Images that Sound: Composing Images and Sounds on a Single Canvas
Ziyang Chen
Daniel Geng
Andrew Owens
DiffM
46
8
0
20 May 2024
ReasonPix2Pix: Instruction Reasoning Dataset for Advanced Image Editing
Ying Jin
Pengyang Ling
Xiao-wen Dong
Pan Zhang
Jiaqi Wang
Dahua Lin
24
2
0
18 May 2024
GaussianVTON: 3D Human Virtual Try-ON via Multi-Stage Gaussian Splatting Editing with Image Prompting
Haodong Chen
Yongle Huang
Haojian Huang
Xiangsheng Ge
Dian Shao
DiffM
32
11
0
13 May 2024
Exploring Text-based Realistic Building Facades Editing Applicaiton
Jing Wang
Xin Zhang
AI4CE
31
1
0
05 May 2024
DGE: Direct Gaussian 3D Editing by Consistent Multi-view Editing
Minghao Chen
Iro Laina
Andrea Vedaldi
3DGS
38
23
0
29 Apr 2024
G-Refine: A General Quality Refiner for Text-to-Image Generation
Chunyi Li
Haoning Wu
Hongkun Hao
Zicheng Zhang
Tengchuan Kou
Chaofeng Chen
Lei Bai
Xiaohong Liu
Weisi Lin
Guangtao Zhai
25
4
0
29 Apr 2024
Paint by Inpaint: Learning to Add Image Objects by Removing Them First
Navve Wasserman
Noam Rotstein
Roy Ganz
Ron Kimmel
DiffM
34
14
0
28 Apr 2024
ObjectAdd: Adding Objects into Image via a Training-Free Diffusion Modification Fashion
Ziyue Zhang
Mingbao Lin
Rongrong Ji
Yuxin Zhang
DiffM
49
3
0
26 Apr 2024
Gorgeous: Create Your Desired Character Facial Makeup from Any Ideas
Jia Wei Sii
Chee Seng Chan
DiffM
43
0
0
22 Apr 2024
A Multimodal Automated Interpretability Agent
Tamar Rott Shaham
Sarah Schwettmann
Franklin Wang
Achyuta Rajaram
Evan Hernandez
Jacob Andreas
Antonio Torralba
16
17
0
22 Apr 2024
LASER: Tuning-Free LLM-Driven Attention Control for Efficient Text-conditioned Image-to-Animation
Haoyu Zheng
Wenqiao Zhang
Yaoke Wang
Hao Zhou
Jiang Liu
Juncheng Li
Zheqi Lv
Siliang Tang
Yueting Zhuang
32
1
0
21 Apr 2024
Generating Daylight-driven Architectural Design via Diffusion Models
Pengzhi Li
Baijuan Li
AI4CE
DiffM
26
11
0
20 Apr 2024
Physical Backdoor Attack can Jeopardize Driving with Vision-Large-Language Models
Zhenyang Ni
Rui Ye
Yuxian Wei
Zhen Xiang
Yanfeng Wang
Siheng Chen
AAML
32
9
0
19 Apr 2024
Factorized Diffusion: Perceptual Illusions by Noise Decomposition
Daniel Geng
Inbum Park
Andrew Owens
DiffM
36
13
0
17 Apr 2024
MaxFusion: Plug&Play Multi-Modal Generation in Text-to-Image Diffusion Models
Nithin Gopalakrishnan Nair
Jeya Maria Jose Valanarasu
Vishal M. Patel
MoMe
33
7
0
15 Apr 2024
RealmDreamer: Text-Driven 3D Scene Generation with Inpainting and Depth Diffusion
Jaidev Shriram
Alex Trevithick
Lingjie Liu
Ravi Ramamoorthi
DiffM
3DGS
65
55
0
10 Apr 2024
InstructHumans: Editing Animated 3D Human Textures with Instructions
Jiayin Zhu
Linlin Yang
Angela Yao
DiffM
28
1
0
05 Apr 2024
Gen3DSR: Generalizable 3D Scene Reconstruction via Divide and Conquer from a Single View
Andreea Dogaru
M. Ozer
Bernhard Egger
3DGS
51
4
0
04 Apr 2024
Benchmarking Counterfactual Image Generation
Thomas Melistas
Nikos Spyrou
Nefeli Gkouti
Pedro Sanchez
Athanasios Vlontzos
Yannis Panagakis
G. Papanastasiou
Sotirios A. Tsaftaris
EGVM
CML
35
5
0
29 Mar 2024
Continuous, Subject-Specific Attribute Control in T2I Models by Identifying Semantic Directions
S. A. Baumann
Felix Krause
Michael Neumayr
Nick Stracke
Vincent Tao Hu
Björn Ommer
DiffM
LM&Ro
66
11
0
25 Mar 2024
Explore In-Context Segmentation via Latent Diffusion Models
Chaoyang Wang
Xiangtai Li
Henghui Ding
Lu Qi
Jiangning Zhang
Yunhai Tong
Chen Change Loy
Shuicheng Yan
DiffM
63
6
0
14 Mar 2024
XPSR: Cross-modal Priors for Diffusion-based Image Super-Resolution
Yunpeng Qu
Kun Yuan
Kai Zhao
Qizhi Xie
Jinhua Hao
Ming-hui Sun
Chao Zhou
24
16
0
08 Mar 2024
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis
Patrick Esser
Sumith Kulal
A. Blattmann
Rahim Entezari
Jonas Müller
...
Zion English
Kyle Lacey
Alex Goodwin
Yannik Marek
Robin Rombach
DiffM
56
1,047
0
05 Mar 2024
From Summary to Action: Enhancing Large Language Models for Complex Tasks with Open World APIs
Yulong Liu
Yunlong Yuan
Chunwei Wang
Jianhua Han
Yongqiang Ma
Li Zhang
Nanning Zheng
Hang Xu
LLMAG
21
5
0
28 Feb 2024
Diffusion Model-Based Image Editing: A Survey
Yi Huang
Jiancheng Huang
Yifan Liu
Mingfu Yan
Jiaxi Lv
Jianzhuang Liu
Wei Xiong
He Zhang
Liangliang Cao
EGVM
66
84
0
27 Feb 2024
Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis
Willi Menapace
Aliaksandr Siarohin
Ivan Skorokhodov
Ekaterina Deyneka
Tsai-Shien Chen
...
Yuwei Fang
A. Stoliar
Elisa Ricci
Jian Ren
Sergey Tulyakov
VGen
38
56
0
22 Feb 2024
SPAD : Spatially Aware Multiview Diffusers
Yash Kant
Ziyi Wu
Michael Vasilkovsky
Guocheng Qian
Jian Ren
R. A. Guler
Bernard Ghanem
Sergey Tulyakov
Igor Gilitschenski
Aliaksandr Siarohin
DiffM
22
34
0
07 Feb 2024
IntentTuner: An Interactive Framework for Integrating Human Intents in Fine-tuning Text-to-Image Generative Models
Xingchen Zeng
Ziyao Gao
Yilin Ye
Wei Zeng
12
12
0
28 Jan 2024
TIP-Editor: An Accurate 3D Editor Following Both Text-Prompts And Image-Prompts
Jingyu Zhuang
Di Kang
Yan-Pei Cao
Guanbin Li
Liang Lin
Ying Shan
DiffM
3DGS
30
38
0
26 Jan 2024
CCA: Collaborative Competitive Agents for Image Editing
Tiankai Hang
Shuyang Gu
Dong Chen
Xin Geng
Baining Guo
20
5
0
23 Jan 2024
Fast Registration of Photorealistic Avatars for VR Facial Animation
Chaitanya Patel
Shaojie Bai
Tenia Wang
Jason M. Saragih
S. Wei
17
0
0
19 Jan 2024
BlenDA: Domain Adaptive Object Detection through diffusion-based blending
Tzuhsuan Huang
Chen-Che Huang
Chung-Hao Ku
Jun-Cheng Chen
29
5
0
18 Jan 2024
Image Translation as Diffusion Visual Programmers
Cheng Han
James Liang
Qifan Wang
Majid Rabbani
S. Dianat
Raghuveer M. Rao
Ying Nian Wu
Dongfang Liu
21
8
0
18 Jan 2024