VQGAN-CLIP: Open Domain Image Generation and Editing with Natural Language Guidance

18 April 2022
Katherine Crowson
Stella Biderman
Daniel Kornis
Dashiell Stander
Eric Hallahan
Louis Castricato
Edward Raff
    CLIP

Papers citing "VQGAN-CLIP: Open Domain Image Generation and Editing with Natural Language Guidance"

50 / 255 papers shown
UPGPT: Universal Diffusion Model for Person Image Generation, Editing and Pose Transfer
Soon Yau Cheong
A. Mustafa
Andrew Gilbert
DiffM
17
12
0
18 Apr 2023
MasaCtrl: Tuning-Free Mutual Self-Attention Control for Consistent Image Synthesis and Editing
Ming Cao
Xintao Wang
Zhongang Qi
Ying Shan
Xiaohu Qie
Yinqiang Zheng
DiffM
20
420
0
17 Apr 2023
Delta Denoising Score
Amir Hertz
Kfir Aberman
Daniel Cohen-Or
DiffM
10
88
0
14 Apr 2023
Soundini: Sound-Guided Diffusion for Natural Video Editing
Seung Hyun Lee
Si-Yeol Kim
Innfarn Yoo
Feng Yang
Donghyeon Cho
Youngseo Kim
Huiwen Chang
Jinkyu Kim
Sangpil Kim
VGen
DiffM
14
15
0
13 Apr 2023
Gradient-Free Textual Inversion
Zhengcong Fei
Mingyuan Fan
Junshi Huang
DiffM
8
31
0
12 Apr 2023
Defense-Prefix for Preventing Typographic Attacks on CLIP
Hiroki Azuma
Yusuke Matsui
VLM
AAML
11
16
0
10 Apr 2023
InstantBooth: Personalized Text-to-Image Generation without Test-Time Finetuning
Jing Shi
Wei Xiong
Zhe-nan Lin
H. J. Jung
DiffM
113
272
0
06 Apr 2023
Exposing and Mitigating Spurious Correlations for Cross-Modal Retrieval
Jae Myung Kim
A. Sophia Koepke
Cordelia Schmid
Zeynep Akata
68
25
0
06 Apr 2023
Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling
Stella Biderman
Hailey Schoelkopf
Quentin G. Anthony
Herbie Bradley
Kyle O'Brien
...
USVSN Sai Prashanth
Edward Raff
Aviya Skowron
Lintang Sutawika
Oskar van der Wal
8
780
0
03 Apr 2023
DreamFace: Progressive Generation of Animatable 3D Faces under Text Guidance
Longwen Zhang
Qiwei Qiu
Hongyang Lin
Qixuan Zhang
Cheng Shi
Wei Yang
Ye Shi
Sibei Yang
Lan Xu
Jingyi Yu
3DH
17
60
0
01 Apr 2023
Forget-Me-Not: Learning to Forget in Text-to-Image Diffusion Models
Eric Zhang
Kai Wang
Xingqian Xu
Zhangyang Wang
Humphrey Shi
DiffM
24
169
0
30 Mar 2023
Discriminative Class Tokens for Text-to-Image Diffusion Models
Idan Schwartz
Vésteinn Snaebjarnarson
Hila Chefer
Ryan Cotterell
Serge J. Belongie
Lior Wolf
Sagie Benaim
6
9
0
30 Mar 2023
MDP: A Generalized Framework for Text-Guided Image Editing by Manipulating the Diffusion Path
Qian Wang
Biao Zhang
Michael Birsak
Peter Wonka
DiffM
6
17
0
29 Mar 2023
StyleDiffusion: Prompt-Embedding Inversion for Text-Based Editing
Senmao Li
Joost van de Weijer
Taihang Hu
F. Khan
Qibin Hou
Yaxing Wang
Jian Yang
DiffM
6
51
0
28 Mar 2023
Text2Tex: Text-driven Texture Synthesis via Diffusion Models
Dave Zhenyu Chen
Yawar Siddiqui
Hsin-Ying Lee
Sergey Tulyakov
Matthias Nießner
DiffM
20
185
0
20 Mar 2023
Retrieving Multimodal Information for Augmented Generation: A Survey
Ruochen Zhao
Hailin Chen
Weishi Wang
Fangkai Jiao
Do Xuan Long
...
Bosheng Ding
Xiaobao Guo
Minzhi Li
Xingxuan Li
Shafiq R. Joty
11
80
0
20 Mar 2023
P+: Extended Textual Conditioning in Text-to-Image Generation
A. Voynov
Qinghao Chu
Daniel Cohen-Or
Kfir Aberman
VLM
DiffM
12
116
0
16 Mar 2023
SpectralCLIP: Preventing Artifacts in Text-Guided Style Transfer from a Spectral Perspective
Zipeng Xu
Songlong Xing
E. Sangineto
N. Sebe
CLIP
9
1
0
16 Mar 2023
Automatic Geo-alignment of Artwork in Children's Story Books
Jakub J Dylag
V. Suarez
James Wald
Aneesha Amodini Uvara
DiffM
19
0
0
16 Mar 2023
Aerial Diffusion: Text Guided Ground-to-Aerial View Translation from a Single Image using Diffusion Models
D. Kothandaraman
Tianyi Zhou
Ming Lin
Dinesh Manocha
14
5
0
15 Mar 2023
Zero-Shot Contrastive Loss for Text-Guided Diffusion Image Style Transfer
Serin Yang
Hyunmin Hwang
Jong Chul Ye
DiffM
73
53
0
15 Mar 2023
Architext: Language-Driven Generative Architecture Design
Theodoros Galanos
Antonios Liapis
Georgios N. Yannakakis
VLM
AI4CE
8
6
0
13 Mar 2023
Prompting AI Art: An Investigation into the Creative Skill of Prompt Engineering
J. Oppenlaender
Rhema Linder
Johanna M. Silvennoinen
8
39
0
13 Mar 2023
Transformer-based Image Generation from Scene Graphs
Renato Sortino
S. Palazzo
C. Spampinato
ViT
25
9
0
08 Mar 2023
IPA-CLIP: Integrating Phonetic Priors into Vision and Language Pretraining
Chihaya Matsuhira
Marc A. Kastner
Takahiro Komamizu
Takatsugu Hirayama
Keisuke Doman
Yasutomo Kawanishi
Ichiro Ide
21
6
0
06 Mar 2023
TextIR: A Simple Framework for Text-based Editable Image Restoration
Yun-Hao Bai
Cairong Wang
Shuzhao Xie
Chao Dong
Chun Yuan
Zhi Wang
DiffM
16
13
0
28 Feb 2023
Prompt Stealing Attacks Against Text-to-Image Generation Models
Xinyue Shen
Y. Qu
Michael Backes
Yang Zhang
11
20
0
20 Feb 2023
Affect-Conditioned Image Generation
F. Ibarrola
R. Lulham
Kazjon Grace
DiffM
14
2
0
20 Feb 2023
RePrompt: Automatic Prompt Editing to Refine AI-Generative Art Towards Precise Expressions
Yunlong Wang
Shuyuan Shen
Brian Y. Lim
20
53
0
19 Feb 2023
PRedItOR: Text Guided Image Editing with Diffusion Prior
Hareesh Ravi
Sachin Kelkar
Midhun Harikumar
Ajinkya Kale
DiffM
42
10
0
15 Feb 2023
ConceptFusion: Open-set Multimodal 3D Mapping
Krishna Murthy Jatavallabhula
Ali Kuwajerwala
Qiao Gu
Mohd. Omama
Tao Chen
...
Celso Miguel de Melo
Madhava Krishna
Liam Paull
Florian Shkurti
Antonio Torralba
6
152
0
14 Feb 2023
Multi-modal Machine Learning in Engineering Design: A Review and Future Directions
Binyang Song
Ruilin Zhou
Faez Ahmed
AI4CE
26
38
0
14 Feb 2023
Zero-shot Generation of Coherent Storybook from Plain Text Story using Diffusion Models
Hyeonho Jeong
Gihyun Kwon
Jong Chul Ye
16
16
0
08 Feb 2023
Attend-and-Excite: Attention-Based Semantic Guidance for Text-to-Image Diffusion Models
Hila Chefer
Yuval Alaluf
Yael Vinker
Lior Wolf
Daniel Cohen-Or
DiffM
12
333
0
31 Jan 2023
Shape-aware Text-driven Layered Video Editing
Yao-Chih Lee
Ji-Ze Jang
Yi-Ting Chen
Elizabeth Qiu
Jia-Bin Huang
VGen
DiffM
23
51
0
30 Jan 2023
StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis
Axel Sauer
Tero Karras
S. Laine
Andreas Geiger
Timo Aila
16
151
0
23 Jan 2023
T2M-GPT: Generating Human Motion from Textual Descriptions with Discrete Representations
Jianrong Zhang
Yangsong Zhang
Xiaodong Cun
Shaoli Huang
Yong Zhang
Hongwei Zhao
Hongtao Lu
Xiaodong Shen
20
311
0
15 Jan 2023
Diatom-inspired architected materials using language-based deep learning: Perception, transformation and manufacturing
Markus J. Buehler
AI4CE
8
4
0
14 Jan 2023
ANNA: Abstractive Text-to-Image Synthesis with Filtered News Captions
Aashish Anantha Ramakrishnan
Sharon X. Huang
Dongwon Lee
14
1
0
05 Jan 2023
Attribute-Centric Compositional Text-to-Image Generation
Yuren Cong
Martin Renqiang Min
Erran L. Li
Bodo Rosenhahn
M. Yang
41
11
0
04 Jan 2023
Stroke-based Rendering: From Heuristics to Deep Learning
Florian Nolte
Andrew Melnik
Helge J. Ritter
GAN
20
4
0
30 Dec 2022
Contrastive Language-Vision AI Models Pretrained on Web-Scraped Multimodal Data Exhibit Sexual Objectification Bias
Robert Wolfe
Yiwei Yang
Billy Howe
Aylin Caliskan
DiffM
8
40
0
21 Dec 2022
Distilling Vision-Language Pre-training to Collaborate with Weakly-Supervised Temporal Action Localization
Chen Ju
Kunhao Zheng
Jinxian Liu
Peisen Zhao
Ya-Qin Zhang
Jianlong Chang
Yanfeng Wang
Qi Tian
11
7
0
19 Dec 2022
Text-Guided Mask-free Local Image Retouching
Zerun Liu
Fan Zhang
Jingxuan He
Jin Wang
Zhangye Wang
Lechao Cheng
DiffM
14
2
0
15 Dec 2022
Artificial Intelligence for Health Message Generation: Theory, Method, and an Empirical Study Using Prompt Engineering
Sue Lim
Ralf Schmälzle
8
31
0
14 Dec 2022
Task Bias in Vision-Language Models
Sachit Menon
I. Chandratreya
Carl Vondrick
VLM
SSL
4
4
0
08 Dec 2022
M-VADER: A Model for Diffusion with Multimodal Context
Samuel Weinbach
Marco Bellagente
C. Eichenberg
Andrew M. Dai
R. Baldock
Souradeep Nanda
Bjorn Deiseroth
Koen Oostermeijer
H. Teufel
Andres Felipe Cruz Salinas
DiffM
27
11
0
06 Dec 2022
SpaText: Spatio-Textual Representation for Controllable Image Generation
Omri Avrahami
Thomas Hayes
Oran Gafni
Sonal Gupta
Yaniv Taigman
Devi Parikh
Dani Lischinski
Ohad Fried
Xiaoyue Yin
DiffM
19
149
0
25 Nov 2022
Inversion-Based Style Transfer with Diffusion Models
Yu-xin Zhang
Nisha Huang
Fan Tang
Haibin Huang
Chongyang Ma
Weiming Dong
Changsheng Xu
DiffM
12
253
0
23 Nov 2022
Mitigating and Evaluating Static Bias of Action Representations in the Background and the Foreground
Haoxin Li
Yuan Liu
Hanwang Zhang
Boyang Li
17
15
0
23 Nov 2022