2204.08583
Cited By
VQGAN-CLIP: Open Domain Image Generation and Editing with Natural Language Guidance
18 April 2022
Katherine Crowson
Stella Biderman
Daniel Kornis
Dashiell Stander
Eric Hallahan
Louis Castricato
Edward Raff
CLIP
Papers citing
"VQGAN-CLIP: Open Domain Image Generation and Editing with Natural Language Guidance"
50 / 255 papers shown
Title
TextGaze: Gaze-Controllable Face Generation with Natural Language
Hengfei Wang
Zhongqun Zhang
Yihua Cheng
Hyung Jin Chang
DiffM
25
2
0
26 Apr 2024
LASER: Tuning-Free LLM-Driven Attention Control for Efficient Text-conditioned Image-to-Animation
Haoyu Zheng
Wenqiao Zhang
Yaoke Wang
Hao Zhou
Jiang Liu
Juncheng Li
Zheqi Lv
Siliang Tang
Yueting Zhuang
16
1
0
21 Apr 2024
TiNO-Edit: Timestep and Noise Optimization for Robust Diffusion-Based Image Editing
Sherry X Chen
Yaron Vaxman
Elad Ben Baruch
David Asulin
Aviad Moreshet
Kuo-Chin Lien
Misha Sra
Pradeep Sen
23
3
0
17 Apr 2024
ANCHOR: LLM-driven News Subject Conditioning for Text-to-Image Synthesis
Aashish Anantha Ramakrishnan
Sharon X. Huang
Dongwon Lee
16
0
0
15 Apr 2024
RankCLIP: Ranking-Consistent Language-Image Pretraining
Yiming Zhang
Zhuokai Zhao
Zhaorun Chen
Zhili Feng
Zenghui Ding
Yining Sun
SSL
VLM
29
7
0
15 Apr 2024
SwapAnything: Enabling Arbitrary Object Swapping in Personalized Visual Editing
Jing Gu
Yilin Wang
Nanxuan Zhao
Wei Xiong
Qing Liu
Zhifei Zhang
He Zhang
Jianming Zhang
HyunJoon Jung
Xin Eric Wang
DiffM
14
8
0
08 Apr 2024
Dynamic Prompt Optimizing for Text-to-Image Generation
Wenyi Mo
Tianyu Zhang
Yalong Bai
Bing-Huang Su
Ji-Rong Wen
Qing Yang
22
0
0
05 Apr 2024
Cross-Modal Conditioned Reconstruction for Language-guided Medical Image Segmentation
Xiaoshuang Huang
Hongxiang Li
Meng Cao
Long Chen
Chenyu You
Dong An
VLM
33
5
0
03 Apr 2024
CLIPtone: Unsupervised Learning for Text-based Image Tone Adjustment
Hyeongmin Lee
Kyoungkook Kang
Jungseul Ok
Sunghyun Cho
CLIP
19
2
0
01 Apr 2024
Dual Memory Networks: A Versatile Adaptation Approach for Vision-Language Models
Yabin Zhang
Wen-Qing Zhu
Hui Tang
Zhiyuan Ma
Kaiyang Zhou
Lei Zhang
VLM
13
20
0
26 Mar 2024
Long-CLIP: Unlocking the Long-Text Capability of CLIP
Beichen Zhang
Pan Zhang
Xiao-wen Dong
Yuhang Zang
Jiaqi Wang
CLIP
VLM
18
106
0
22 Mar 2024
Magic Fixup: Streamlining Photo Editing by Watching Dynamic Videos
Hadi Alzayer
Zhihao Xia
Xuaner Zhang
Eli Shechtman
Jia-Bin Huang
Michael Gharbi
DiffM
VGen
19
5
0
19 Mar 2024
LogicalDefender: Discovering, Extracting, and Utilizing Common-Sense Knowledge
Yuhe Liu
Mengxue Kang
Zengchang Qin
Xiangxiang Chu
NAI
VLM
25
0
0
18 Mar 2024
EffiVED: Efficient Video Editing via Text-instruction Diffusion Models
Zhenghao Zhang
Zuozhuo Dai
Long Qin
Weizhi Wang
DiffM
VGen
27
0
0
18 Mar 2024
HyperVQ: MLR-based Vector Quantization in Hyperbolic Space
Nabarun Goswami
Yusuke Mukuta
Tatsuya Harada
27
3
0
18 Mar 2024
E4C: Enhance Editability for Text-Based Image Editing by Harnessing Efficient CLIP Guidance
Tianrui Huang
Pu Cao
Lu Yang
Chun Liu
Mengjie Hu
Zhiwei Liu
Qing-Huang Song
DiffM
23
0
0
15 Mar 2024
PrompTHis: Visualizing the Process and Influence of Prompt Editing during Text-to-Image Creation
Yuhan Guo
Hanning Shao
Can Liu
Kai Xu
Xiaoru Yuan
DiffM
16
0
0
14 Mar 2024
FaceChain-SuDe: Building Derived Class to Inherit Category Attributes for One-shot Subject-Driven Generation
Pengchong Qiao
Lei Shang
Chang-Shu Liu
Baigui Sun
Xiang Ji
Jie Chen
CVBM
25
0
0
11 Mar 2024
Text-Guided Variational Image Generation for Industrial Anomaly Detection and Segmentation
Mingyu Lee
Jongwon Choi
16
8
0
10 Mar 2024
Benchmarking Segmentation Models with Mask-Preserved Attribute Editing
Zijin Yin
Kongming Liang
Bing Li
Zhanyu Ma
Jun Guo
VLM
25
2
0
02 Mar 2024
Text-guided Explorable Image Super-resolution
Kanchana Vaishnavi Gandikota
Paramanand Chandramouli
21
1
0
02 Mar 2024
LoMOE: Localized Multi-Object Editing via Multi-Diffusion
Goirik Chakrabarty
Aditya Chandrasekar
Ramya Hebbalaguppe
AP Prathosh
DiffM
38
1
0
01 Mar 2024
CustomSketching: Sketch Concept Extraction for Sketch-based Image Synthesis and Editing
Chufeng Xiao
Hongbo Fu
DiffM
18
1
0
27 Feb 2024
Referee Can Play: An Alternative Approach to Conditional Generation via Model Inversion
Xuantong Liu
Tianyang Hu
Wenjia Wang
Kenji Kawaguchi
Yuan Yao
DiffM
24
3
0
26 Feb 2024
Human Aesthetic Preference-Based Large Text-to-Image Model Personalization: Kandinsky Generation as an Example
Aven Le Zhou
Yu-Ao Wang
Wei Wu
Kang Zhang
11
0
0
09 Feb 2024
Counterfactual Image Editing
Yushu Pan
Elias Bareinboim
BDL
CML
14
1
0
07 Feb 2024
LanDA: Language-Guided Multi-Source Domain Adaptation
Zhenbin Wang
Lei Zhang
Lituan Wang
Minjuan Zhu
10
10
0
25 Jan 2024
LLMRA: Multi-modal Large Language Model based Restoration Assistant
Xiaoyu Jin
Yuan Shi
Bin Xia
Wenming Yang
21
4
0
21 Jan 2024
Design Frameworks for Spatial Zone Agents in XRI Metaverse Smart Environments
Jie Guan
Jiamin Liu
Alexis Morris
11
0
0
19 Jan 2024
On mitigating stability-plasticity dilemma in CLIP-guided image morphing via geodesic distillation loss
Yeongtak Oh
Saehyung Lee
Uiwon Hwang
Sungroh Yoon
17
0
0
19 Jan 2024
Image Translation as Diffusion Visual Programmers
Cheng Han
James Liang
Qifan Wang
Majid Rabbani
S. Dianat
Raghuveer M. Rao
Ying Nian Wu
Dongfang Liu
8
8
0
18 Jan 2024
SpecRef: A Fast Training-free Baseline of Specific Reference-Condition Real Image Editing
Songyan Chen
Jiancheng Huang
DiffM
8
2
0
07 Jan 2024
Improving Diffusion-Based Image Synthesis with Context Prediction
Ling Yang
Jingwei Liu
Shenda Hong
Zhilong Zhang
Zhilin Huang
Zheming Cai
Wentao Zhang
Bin Cui
DiffM
27
17
0
04 Jan 2024
ZONE: Zero-Shot Instruction-Guided Local Editing
Shanglin Li
Bo-Wen Zeng
Yutang Feng
Sicheng Gao
Xuhui Liu
...
Li Lin
Xu Tang
Yao Hu
Jianzhuang Liu
Baochang Zhang
DiffM
8
30
0
28 Dec 2023
Tuning-Free Inversion-Enhanced Control for Consistent Image Editing
Xiaoyue Duan
Shuhao Cui
Guoliang Kang
Baochang Zhang
Zhengcong Fei
Mingyuan Fan
Junshi Huang
DiffM
13
8
0
22 Dec 2023
TagCLIP: A Local-to-Global Framework to Enhance Open-Vocabulary Multi-Label Classification of CLIP Without Training
Yuqi Lin
Minghao Chen
Kaipeng Zhang
Hengjia Li
Mingming Li
Zheng Yang
Dongqin Lv
Binbin Lin
Haifeng Liu
Deng Cai
CLIP
VLM
36
3
0
20 Dec 2023
T2M-HiFiGPT: Generating High Quality Human Motion from Textual Descriptions with Residual Discrete Representations
Congyi Wang
14
1
0
17 Dec 2023
LIME: Localized Image Editing via Attention Regularization in Diffusion Models
Enis Simsar
A. Tonioni
Yongqin Xian
Thomas Hofmann
Federico Tombari
DiffM
14
8
0
14 Dec 2023
CoIE: Chain-of-Instruct Editing for Multi-Attribute Face Manipulation
Zhenduo Zhang
Bowen Zhang
Guang Liu
17
1
0
13 Dec 2023
Alchemist: Parametric Control of Material Properties with Diffusion Models
Prafull Sharma
Varun Jampani
Yuanzhen Li
Xuhui Jia
Dmitry Lagun
Frédo Durand
William T. Freeman
Mark J. Matthews
DiffM
29
21
0
05 Dec 2023
Multimodality-guided Image Style Transfer using Cross-modal GAN Inversion
Hanyu Wang
Pengxiang Wu
Kevin Dela Rosa
Chen Wang
Abhinav Shrivastava
19
8
0
04 Dec 2023
StoryGPT-V: Large Language Models as Consistent Story Visualizers
Xiaoqian Shen
Mohamed Elhoseiny
VLM
85
9
0
04 Dec 2023
Diffusion Handles: Enabling 3D Edits for Diffusion Models by Lifting Activations to 3D
Karran Pandey
Paul Guerrero
Matheus Gadelha
Yannick Hold-Geoffroy
Karan Singh
Niloy Mitra
DiffM
13
30
0
02 Dec 2023
DEVIAS: Learning Disentangled Video Representations of Action and Scene for Holistic Video Understanding
Kyungho Bae
Geo Ahn
Youngrae Kim
Jinwoo Choi
11
0
0
30 Nov 2023
Synchronizing Vision and Language: Bidirectional Token-Masking AutoEncoder for Referring Image Segmentation
Minhyeok Lee
Dogyoon Lee
Jungho Lee
Suhwan Cho
Heeseung Choi
Ig-Jae Kim
Sangyoun Lee
15
0
0
29 Nov 2023
Text-Driven Image Editing via Learnable Regions
Yuanze Lin
Yi-Wen Chen
Yi-Hsuan Tsai
Lu Jiang
Ming-Hsuan Yang
DiffM
10
6
0
28 Nov 2023
Instruct2Attack: Language-Guided Semantic Adversarial Attacks
Jiang-Long Liu
Chen Wei
Yuxiang Guo
Heng Yu
Alan L. Yuille
S. Feizi
Chun Pong Lau
Rama Chellappa
DiffM
AAML
14
5
0
27 Nov 2023
Perceptual Image Compression with Cooperative Cross-Modal Side Information
Shiyu Qin
Bin Chen
Yujun Huang
Baoyi An
Tao Dai
Shu-Tao Xia
25
1
0
23 Nov 2023
DIFFNAT: Improving Diffusion Image Quality Using Natural Image Statistics
Aniket Roy
Maitreya Suin
Anshul B. Shah
Ketul Shah
Jiang-Long Liu
Rama Chellappa
DiffM
19
2
0
16 Nov 2023
ChatGPT-Powered Hierarchical Comparisons for Image Classification
Zhiyuan Ren
Yiyang Su
Xiaoming Liu
VLM
27
9
0
01 Nov 2023