arXiv:2104.08718, v3 (latest)
CLIPScore: A Reference-free Evaluation Metric for Image Captioning
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
18 April 2021
Jack Hessel
Ari Holtzman
Maxwell Forbes
Ronan Le Bras
Yejin Choi
CLIP
Papers citing
"CLIPScore: A Reference-free Evaluation Metric for Image Captioning"
50 / 1,489 papers shown
MOSAIC: Multi-Object Segmented Arbitrary Stylization Using CLIP
Prajwal Ganugula
Y. Kumar
N. Reddy
Prabhath Chellingi
A. Thakur
Neeraj Kasera
C. S. Anand
CLIP
DiffM
167
4
0
24 Sep 2023
ContextRef: Evaluating Referenceless Metrics For Image Description Generation
International Conference on Learning Representations (ICLR), 2023
Elisa Kreiss
E. Zelikman
Christopher Potts
Nick Haber
246
5
0
21 Sep 2023
Language-driven Object Fusion into Neural Radiance Fields with Pose-Conditioned Dataset Updates
Computer Vision and Pattern Recognition (CVPR), 2023
Kashun Shum
Jaeyeon Kim
Binh-Son Hua
Duc Thanh Nguyen
Sai-Kit Yeung
3DH
AI4CE
227
10
0
20 Sep 2023
Guide Your Agent with Adaptive Multimodal Rewards
Neural Information Processing Systems (NeurIPS), 2023
Changyeon Kim
Younggyo Seo
Hao Liu
Lisa Lee
Jinwoo Shin
Honglak Lee
Kimin Lee
355
11
0
19 Sep 2023
Forgedit: Text Guided Image Editing via Learning and Forgetting
Shiwen Zhang
Shuai Xiao
Weilin Huang
DiffM
228
29
0
19 Sep 2023
What is the Best Automated Metric for Text to Motion Generation?
ACM SIGGRAPH Conference and Exhibition on Computer Graphics and Interactive Techniques in Asia (SIGGRAPH Asia), 2023
Jordan Voas
Yili Wang
Qixing Huang
Raymond Mooney
EGVM
276
17
0
19 Sep 2023
Market-GAN: Adding Control to Financial Market Data Generation with Semantic Context
AAAI Conference on Artificial Intelligence (AAAI), 2023
Haochong Xia
Shuo Sun
Xinrun Wang
Bo An
AIFin
244
14
0
14 Sep 2023
Language Models as Black-Box Optimizers for Vision-Language Models
Computer Vision and Pattern Recognition (CVPR), 2023
Shihong Liu
Zhiqiu Lin
Samuel Yu
Ryan Lee
Tiffany Ling
Deepak Pathak
Deva Ramanan
VLM
411
42
0
12 Sep 2023
Prompting4Debugging: Red-Teaming Text-to-Image Diffusion Models by Finding Problematic Prompts
International Conference on Machine Learning (ICML), 2023
Zhi-Yi Chin
Chieh-Ming Jiang
Ching-Chun Huang
Pin-Yu Chen
Wei-Chen Chiu
DiffM
371
123
0
12 Sep 2023
Prefix-diffusion: A Lightweight Diffusion Model for Diverse Image Captioning
International Conference on Language Resources and Evaluation (LREC), 2023
Guisheng Liu
Yi Li
Zhengcong Fei
Haiyan Fu
Xiangyang Luo
Yanqing Guo
VLM
DiffM
266
16
0
10 Sep 2023
Exploring Sparse MoE in GANs for Text-conditioned Image Synthesis
Computer Vision and Pattern Recognition (CVPR), 2023
Jiapeng Zhu
Ceyuan Yang
Kecheng Zheng
Yinghao Xu
Zifan Shi
Yujun Shen
MoE
262
14
0
07 Sep 2023
Chasing Consistency in Text-to-3D Generation from a Single Image
Yichen Ouyang
Wenhao Chai
Jiayi Ye
Dapeng Tao
Yibing Zhan
Gaoang Wang
DiffM
206
16
0
07 Sep 2023
Generating Realistic Images from In-the-wild Sounds
IEEE International Conference on Computer Vision (ICCV), 2023
Taegyeong Lee
Jeonghun Kang
Hyeonyu Kim
Taehwan Kim
DiffM
256
11
0
05 Sep 2023
ControlMat: A Controlled Generative Approach to Material Capture
ACM Transactions on Graphics (TOG), 2023
Giuseppe Vecchio
Rosalie Martin
Arthur Roullier
Adrien Kaiser
Romain Rouffet
Valentin Deschaintre
T. Boubekeur
DiffM
256
64
0
04 Sep 2023
Exploring Limits of Diffusion-Synthetic Training with Weakly Supervised Semantic Segmentation
Asian Conference on Computer Vision (ACCV), 2023
Ryota Yoshihashi
Yuya Otsuka
Kenji Doi
Tomohiro Tanaka
Hirokatsu Kataoka
469
4
0
04 Sep 2023
RenAIssance: A Survey into AI Text-to-Image Generation in the Era of Large Model
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Fengxiang Bie
Jianlong Wu
Zhongzhu Zhou
Adam Ghanem
Minjia Zhang
...
Pareesa Ameneh Golnari
David A. Clifton
Yuxiong He
Dacheng Tao
Shuaiwen Leon Song
EGVM
256
58
0
02 Sep 2023
Socratis: Are large multimodal models emotionally aware?
Katherine Deng
Arijit Ray
Reuben Tan
Saadia Gabriel
Bryan A. Plummer
Kate Saenko
344
8
0
31 Aug 2023
Dysen-VDM: Empowering Dynamics-aware Text-to-Video Diffusion with LLMs
Computer Vision and Pattern Recognition (CVPR), 2023
Hao Fei
Shengqiong Wu
Wei Ji
Hanwang Zhang
Tat-Seng Chua
VGen
DiffM
220
45
0
26 Aug 2023
Dense Text-to-Image Generation with Attention Modulation
IEEE International Conference on Computer Vision (ICCV), 2023
Yunji Kim
Jiyoung Lee
Jin-Hwa Kim
Jung-Woo Ha
Jun-Yan Zhu
DiffM
317
182
0
24 Aug 2023
With a Little Help from your own Past: Prototypical Memory Networks for Image Captioning
IEEE International Conference on Computer Vision (ICCV), 2023
Manuele Barraco
Sara Sarto
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
VLM
186
31
0
23 Aug 2023
CgT-GAN: CLIP-guided Text GAN for Image Captioning
ACM Multimedia (ACM MM), 2023
Jiarui Yu
Haoran Li
Y. Hao
B. Zhu
Tong Xu
Xiangnan He
VLM
CLIP
229
24
0
23 Aug 2023
MusicJam: Visualizing Music Insights via Generated Narrative Illustrations
Communications in Information and Systems (CIS), 2023
Chuer Chen
Nan Cao
Jiani Hou
Yi Guo
Yulei Zhang
Yang Shi
DiffM
201
1
0
22 Aug 2023
DiffCloth: Diffusion Based Garment Synthesis and Manipulation via Structural Cross-modal Semantic Alignment
IEEE International Conference on Computer Vision (ICCV), 2023
Xujie Zhang
Binbin Yang
Michael C. Kampffmeyer
Wenqing Zhang
Shiyue Zhang
Guansong Lu
Liang Lin
Hang Xu
Xiaodan Liang
DiffM
396
17
0
22 Aug 2023
Generic Attention-model Explainability by Weighted Relevance Accumulation
ACM Multimedia Asia (MA), 2023
Yiming Huang
Ao Jia
Xiaodan Zhang
Jiawei Zhang
156
4
0
20 Aug 2023
AltDiffusion: A Multilingual Text-to-Image Diffusion Model
AAAI Conference on Artificial Intelligence (AAAI), 2023
Fulong Ye
Guangyi Liu
Xinya Wu
Ledell Yu Wu
VLM
309
47
0
19 Aug 2023
DUAW: Data-free Universal Adversarial Watermark against Stable Diffusion Customization
Xiaoyu Ye
Hao Huang
Jiaqi An
Yongtao Wang
WIGM
226
26
0
19 Aug 2023
DiffDis: Empowering Generative Diffusion Model with Cross-Modal Discrimination Capability
IEEE International Conference on Computer Vision (ICCV), 2023
Runhu Huang
Jianhua Han
Guansong Lu
Xiaodan Liang
Yihan Zeng
Wei Zhang
Hang Xu
DiffM
171
8
0
18 Aug 2023
Learning to Generate Semantic Layouts for Higher Text-Image Correspondence in Text-to-Image Synthesis
IEEE International Conference on Computer Vision (ICCV), 2023
Minho Park
Jooyeol Yun
Seunghwan Choi
Jaegul Choo
DiffM
183
12
0
16 Aug 2023
IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models
Hu Ye
Jun Zhang
Siyi Liu
Xiao Han
Wei Yang
DiffM
323
1,282
0
13 Aug 2023
DIG In: Evaluating Disparities in Image Generations with Indicators for Geographic Diversity
Melissa Hall
Candace Ross
Adina Williams
Nicolas Carion
M. Drozdzal
Adriana Romero Soriano
EGVM
362
9
0
11 Aug 2023
The Five-Dollar Model: Generating Game Maps and Sprites from Sentence Embeddings
Artificial Intelligence and Interactive Digital Entertainment Conference (AIIDE), 2023
Timothy Merino
Roman Negri
Dipika Rajesh
M. Charity
Julian Togelius
DiffM
VLM
165
19
0
08 Aug 2023
Learning Concise and Descriptive Attributes for Visual Recognition
IEEE International Conference on Computer Vision (ICCV), 2023
Andy Yan
Yu Wang
Yiwu Zhong
Chengyu Dong
Zexue He
Yujie Lu
William Wang
Jingbo Shang
Julian McAuley
VLM
297
85
0
07 Aug 2023
Food-500 Cap: A Fine-Grained Food Caption Benchmark for Evaluating Vision-Language Models
ACM Multimedia (ACM MM), 2023
Zheng Ma
Mianzhi Pan
Wenhan Wu
Ka Leong Cheng
Jianbing Zhang
Shujian Huang
Jiajun Chen
VLM
CoGe
237
8
0
06 Aug 2023
Multimodal Neurons in Pretrained Text-Only Transformers
Sarah Schwettmann
Neil Chowdhury
Samuel J. Klein
David Bau
Antonio Torralba
MILM
275
43
0
03 Aug 2023
Reverse Stable Diffusion: What prompt was used to generate this image?
Computer Vision and Image Understanding (CVIU), 2023
Florinel-Alin Croitoru
Vlad Hondru
Radu Tudor Ionescu
M. Shah
VLM
DiffM
276
11
0
02 Aug 2023
Guiding Image Captioning Models Toward More Specific Captions
IEEE International Conference on Computer Vision (ICCV), 2023
Simon Kornblith
Lala Li
Zirui Wang
Thao Nguyen
320
21
0
31 Jul 2023
Visual Captioning at Will: Describing Images and Videos Guided by a Few Stylized Sentences
ACM Multimedia (ACM MM), 2023
Di Yang
Hongyu Chen
Xinglin Hou
Bo Xiao
Yuning Jiang
Qin Jin
241
8
0
31 Jul 2023
UniBriVL: Robust Universal Representation and Generation of Audio Driven Diffusion Models
Sen Fang
Bowen Gao
Yangjian Wu
T. Teoh
DiffM
227
1
0
29 Jul 2023
Exploring Annotation-free Image Captioning with Retrieval-augmented Pseudo Sentence Generation
ACM Multimedia Asia (MA), 2023
Zhiyuan Li
Dongnan Liu
Heng Wang
Chaoyi Zhang
Weidong (Tom) Cai
RALM
199
2
0
27 Jul 2023
Improving Multimodal Datasets with Image Captioning
Neural Information Processing Systems (NeurIPS), 2023
Thao Nguyen
S. Gadre
Gabriel Ilharco
Sewoong Oh
Ludwig Schmidt
VLM
263
125
0
19 Jul 2023
Text2Layer: Layered Image Generation using Latent Diffusion Model
Xinyang Zhang
Wentian Zhao
Xin Lu
J. Chien
DiffM
196
27
0
19 Jul 2023
Let's ViCE! Mimicking Human Cognitive Behavior in Image Generation Evaluation
ACM Multimedia (ACM MM), 2023
Federico Betti
Jacopo Staiano
Lorenzo Baraldi
Lorenzo Baraldi
Rita Cucchiara
Andrii Zadaianchuk
EGVM
154
12
0
18 Jul 2023
Towards Safe Self-Distillation of Internet-Scale Text-to-Image Diffusion Models
Sanghyun Kim
Seohyeong Jung
Balhae Kim
Moonseok Choi
Jinwoo Shin
Juho Lee
DiffM
140
37
0
12 Jul 2023
Divide, Evaluate, and Refine: Evaluating and Improving Text-to-Image Alignment with Iterative VQA Feedback
Neural Information Processing Systems (NeurIPS), 2023
Jaskirat Singh
Liang Zheng
306
39
0
10 Jul 2023
Linear Alignment of Vision-language Models for Image Captioning
Fabian Paischer
M. Hofmarcher
Sepp Hochreiter
Thomas Adler
CLIP
VLM
486
2
0
10 Jul 2023
CLIPAG: Towards Generator-Free Text-to-Image Generation
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Roy Ganz
Michael Elad
VLM
227
15
0
29 Jun 2023
Self-Supervised Image Captioning with CLIP
Chuanyang Jin
VLM
SSL
210
3
0
26 Jun 2023
Restart Sampling for Improving Generative Processes
Neural Information Processing Systems (NeurIPS), 2023
Yilun Xu
Mingyang Deng
Xiang Cheng
Yonglong Tian
Ziming Liu
Tommi Jaakkola
DiffM
VLM
315
78
0
26 Jun 2023
Learning Descriptive Image Captioning via Semipermeable Maximum Likelihood Estimation
Neural Information Processing Systems (NeurIPS), 2023
Zihao Yue
Anwen Hu
Liang Zhang
Qin Jin
350
7
0
23 Jun 2023
Listener Model for the PhotoBook Referential Game with CLIPScores as Implicit Reference Chain
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Shih-Lun Wu
Yi-Hui Chou
Liang Li
152
0
0
16 Jun 2023
Page 26 of 30