arXiv:2108.02818
Evaluating CLIP: Towards Characterization of Broader Capabilities and Downstream Implications
5 August 2021
Sandhini Agarwal
Gretchen Krueger
Jack Clark
Alec Radford
Jong Wook Kim
Miles Brundage
Papers citing "Evaluating CLIP: Towards Characterization of Broader Capabilities and Downstream Implications"
45 / 95 papers shown
Composed Image Retrieval using Contrastive Learning and Task-oriented CLIP-based Features
Alberto Baldrati
Marco Bertini
Tiberio Uricchio
A. Bimbo
CLIP
CoGe
69
35
0
22 Aug 2023
PerceptionCLIP: Visual Classification by Inferring and Conditioning on Contexts
Bang An
Sicheng Zhu
Michael-Andrei Panaitescu-Liess
Chaithanya Kumar Mummadi
Furong Huang
VLM
95
8
0
02 Aug 2023
Distilling Knowledge from Text-to-Image Generative Models Improves Visio-Linguistic Reasoning in CLIP
S. Basu
S. Hu
Maziar Sanjabi
Daniela Massiceti
Soheil Feizi
VLM
46
4
0
18 Jul 2023
What a MESS: Multi-Domain Evaluation of Zero-Shot Semantic Segmentation
Benedikt Blumenstiel
Johannes Jakubik
Hilde Kuhne
Michael Vossing
VLM
129
18
0
27 Jun 2023
VisoGender: A dataset for benchmarking gender bias in image-text pronoun resolution
Elizaveta Semenova
F. G. Abrantes
Hanwen Zhu
Grace A. Sodunke
Aleksandar Shtedritski
Hannah Rose Kirk
CoGe
125
46
0
21 Jun 2023
RS5M and GeoRSCLIP: A Large Scale Vision-Language Dataset and A Large Vision-Language Model for Remote Sensing
Zilun Zhang
Tiancheng Zhao
Yulong Guo
Yuxiang Cai
DiffM
VLM
159
66
0
20 Jun 2023
Modular Visual Question Answering via Code Generation
Sanjay Subramanian
Medhini Narasimhan
Kushal Khangaonkar
Kevin Kaichuang Yang
Arsha Nagrani
Cordelia Schmid
Andy Zeng
Trevor Darrell
Dan Klein
77
51
0
08 Jun 2023
Semantically-Prompted Language Models Improve Visual Descriptions
Michael Ogezi
B. Hauer
Grzegorz Kondrak
VLM
61
0
0
05 Jun 2023
Balancing the Picture: Debiasing Vision-Language Datasets with Synthetic Contrast Sets
Brandon Smith
Miguel Farinha
Elizaveta Semenova
Hannah Rose Kirk
Aleksandar Shtedritski
Max Bain
90
19
0
24 May 2023
Gender Biases in Automatic Evaluation Metrics for Image Captioning
Haoyi Qiu
Zi-Yi Dou
Tianlu Wang
Asli Celikyilmaz
Nanyun Peng
EGVM
119
16
0
24 May 2023
Learning the Visualness of Text Using Large Vision-Language Models
Gaurav Verma
Ryan Rossi
Chris Tensmeyer
Jiuxiang Gu
A. Nenkova
VLM
71
0
0
11 May 2023
What does CLIP know about a red circle? Visual prompt engineering for VLMs
Aleksandar Shtedritski
Christian Rupprecht
Andrea Vedaldi
VLM
MLLM
113
162
0
13 Apr 2023
Mitigating Spurious Correlations in Multi-modal Models during Fine-tuning
Yu Yang
Besmira Nushi
Hamid Palangi
Baharan Mirzasoleiman
103
39
0
08 Apr 2023
Self-Supervised Multimodal Learning: A Survey
Yongshuo Zong
Oisin Mac Aodha
Timothy M. Hospedales
SSL
125
50
0
31 Mar 2023
MultiModal Bias: Introducing a Framework for Stereotypical Bias Assessment beyond Gender and Race in Vision Language Models
Sepehr Janghorbani
Gerard de Melo
VLM
106
12
0
16 Mar 2023
Automatic Geo-alignment of Artwork in Children's Story Books
Jakub J Dylag
V. Suarez
James Wald
Aneesha Amodini Uvara
DiffM
67
0
0
16 Mar 2023
Text2Face: A Multi-Modal 3D Face Model
William Rowan
P. Huber
Nick E. Pears
Andrew Keeling
3DH
59
3
0
05 Mar 2023
Towards Reliable Assessments of Demographic Disparities in Multi-Label Image Classifiers
Melissa Hall
Bobbie Chern
Laura Gustafson
Denisse Ventura
Harshad Kulkarni
Candace Ross
Nicolas Usunier
71
6
0
16 Feb 2023
A Friendly Face: Do Text-to-Image Systems Rely on Stereotypes when the Input is Under-Specified?
Kathleen C. Fraser
S. Kiritchenko
I. Nejadgholi
DiffM
80
38
0
14 Feb 2023
Auditing Gender Presentation Differences in Text-to-Image Models
Yanzhe Zhang
Lu Jiang
Greg Turk
Diyi Yang
EGVM
90
24
0
07 Feb 2023
Debiasing Vision-Language Models via Biased Prompts
Ching-Yao Chuang
Varun Jampani
Yuanzhen Li
Antonio Torralba
Stefanie Jegelka
VLM
122
107
0
31 Jan 2023
Discovering and Mitigating Visual Biases through Keyword Explanation
Younghyun Kim
Sangwoo Mo
Minkyu Kim
Kyungmin Lee
Jaeho Lee
Jinwoo Shin
165
34
0
26 Jan 2023
Vision-Language Models Performing Zero-Shot Tasks Exhibit Gender-based Disparities
Melissa Hall
Laura Gustafson
Aaron B. Adcock
Ishan Misra
Candace Ross
VLM
101
24
0
26 Jan 2023
Contrastive Language-Vision AI Models Pretrained on Web-Scraped Multimodal Data Exhibit Sexual Objectification Bias
Robert Wolfe
Yiwei Yang
Billy Howe
Aylin Caliskan
DiffM
133
57
0
21 Dec 2022
Improving Zero-Shot Models with Label Distribution Priors
Jonathan Kahana
Niv Cohen
Yedid Hoshen
VLM
138
14
0
01 Dec 2022
Zero-shot Image Captioning by Anchor-augmented Vision-Language Space Alignment
Junyan Wang
Yi Zhang
Ming Yan
Ji Zhang
Jitao Sang
VLM
64
9
0
14 Nov 2022
Evaluating and Improving Factuality in Multimodal Abstractive Summarization
David Wan
Joey Tianyi Zhou
93
10
0
04 Nov 2022
Masked Vision-Language Transformer in Fashion
Ge-Peng Ji
Mingchen Zhuge
D. Gao
Deng-Ping Fan
Daniel Gehrig
Luc Van Gool
90
25
0
27 Oct 2022
FairCLIP: Social Bias Elimination based on Attribute Prototype Learning and Representation Neutralization
Junyan Wang
Yi Zhang
Jitao Sang
FaML
VLM
102
24
0
26 Oct 2022
Neural Eigenfunctions Are Structured Representation Learners
Zhijie Deng
Jiaxin Shi
Hao Zhang
Peng Cui
Cewu Lu
Jun Zhu
109
14
0
23 Oct 2022
Self-supervised debiasing using low rank regularization
Geon Yeong Park
Chanyong Jung
Sangmin Lee
Jong Chul Ye
Sang Wan Lee
CML
SSL
106
4
0
11 Oct 2022
American == White in Multimodal Language-and-Image AI
Robert Wolfe
Aylin Caliskan
VLM
90
51
0
01 Jul 2022
Know your audience: specializing grounded language models with listener subtraction
Aaditya K. Singh
David Ding
Andrew M. Saxe
Felix Hill
Andrew Kyle Lampinen
65
2
0
16 Jun 2022
DORA: Exploring Outlier Representations in Deep Neural Networks
Kirill Bykov
Mayukh Deb
Dennis Grinwald
Klaus-Robert Müller
Marina M.-C. Höhne
123
13
0
09 Jun 2022
Markedness in Visual Semantic AI
Robert Wolfe
Aylin Caliskan
VLM
112
36
0
23 May 2022
Evidence for Hypodescent in Visual Semantic AI
Robert Wolfe
M. Banaji
Aylin Caliskan
VLM
96
38
0
22 May 2022
A CLIP-Hitchhiker's Guide to Long Video Retrieval
Max Bain
Arsha Nagrani
Gül Varol
Andrew Zisserman
CLIP
191
62
0
17 May 2022
ReCLIP: A Strong Zero-Shot Baseline for Referring Expression Comprehension
Sanjay Subramanian
William Merrill
Trevor Darrell
Matt Gardner
Sameer Singh
Anna Rohrbach
ObjD
114
128
0
12 Apr 2022
A Prompt Array Keeps the Bias Away: Debiasing Vision-Language Models with Adversarial Learning
Hugo Elias Berg
Elizaveta Semenova
Yash Bhalgat
Wonsuk Yang
Hannah Rose Kirk
Aleksandar Shtedritski
Max Bain
VLM
108
101
0
22 Mar 2022
Contrastive Visual Semantic Pretraining Magnifies the Semantics of Natural Language Representations
Robert Wolfe
Aylin Caliskan
VLM
67
14
0
14 Mar 2022
Scaling Open-Vocabulary Image Segmentation with Image-Level Labels
Golnaz Ghiasi
Xiuye Gu
Huayu Chen
Nayeon Lee
VLM
188
387
0
22 Dec 2021
Twitter-COMMs: Detecting Climate, COVID, and Military Multimodal Misinformation
Giscard Biamby
Grace Luo
Trevor Darrell
Anna Rohrbach
68
26
0
16 Dec 2021
Text2Mesh: Text-Driven Neural Stylization for Meshes
O. Michel
Roi Bar-On
Richard Liu
Sagie Benaim
Rana Hanocka
CLIP
AI4CE
280
362
0
06 Dec 2021
CLOOB: Modern Hopfield Networks with InfoLOOB Outperform CLIP
Andreas Fürst
Elisabeth Rumetshofer
Johannes Lehner
Viet-Hung Tran
Fei Tang
...
David P. Kreil
Michael K Kopp
Günter Klambauer
Angela Bitto-Nemling
Sepp Hochreiter
VLM
CLIP
314
104
0
21 Oct 2021
CLIPScore: A Reference-free Evaluation Metric for Image Captioning
Jack Hessel
Ari Holtzman
Maxwell Forbes
Ronan Le Bras
Yejin Choi
CLIP
269
1,597
0
18 Apr 2021