ResearchTrend.AI

Evaluating CLIP: Towards Characterization of Broader Capabilities and Downstream Implications

5 August 2021
Sandhini Agarwal
Gretchen Krueger
Jack Clark
Alec Radford
Jong Wook Kim
Miles Brundage

Papers citing "Evaluating CLIP: Towards Characterization of Broader Capabilities and Downstream Implications"

45 / 95 papers shown
Title
Composed Image Retrieval using Contrastive Learning and Task-oriented CLIP-based Features
Alberto Baldrati
Marco Bertini
Tiberio Uricchio
A. Bimbo
CLIP, CoGe
69
35
0
22 Aug 2023
PerceptionCLIP: Visual Classification by Inferring and Conditioning on Contexts
Bang An
Sicheng Zhu
Michael-Andrei Panaitescu-Liess
Chaithanya Kumar Mummadi
Furong Huang
VLM
95
8
0
02 Aug 2023
Distilling Knowledge from Text-to-Image Generative Models Improves Visio-Linguistic Reasoning in CLIP
S. Basu
S. Hu
Maziar Sanjabi
Daniela Massiceti
Soheil Feizi
VLM
46
4
0
18 Jul 2023
What a MESS: Multi-Domain Evaluation of Zero-Shot Semantic Segmentation
Benedikt Blumenstiel
Johannes Jakubik
Hilde Kuhne
Michael Vossing
VLM
129
18
0
27 Jun 2023
VisoGender: A dataset for benchmarking gender bias in image-text pronoun resolution
Elizaveta Semenova
F. G. Abrantes
Hanwen Zhu
Grace A. Sodunke
Aleksandar Shtedritski
Hannah Rose Kirk
CoGe
125
46
0
21 Jun 2023
RS5M and GeoRSCLIP: A Large Scale Vision-Language Dataset and A Large Vision-Language Model for Remote Sensing
Zilun Zhang
Tiancheng Zhao
Yulong Guo
Yuxiang Cai
DiffM, VLM
159
66
0
20 Jun 2023
Modular Visual Question Answering via Code Generation
Sanjay Subramanian
Medhini Narasimhan
Kushal Khangaonkar
Kevin Kaichuang Yang
Arsha Nagrani
Cordelia Schmid
Andy Zeng
Trevor Darrell
Dan Klein
77
51
0
08 Jun 2023
Semantically-Prompted Language Models Improve Visual Descriptions
Michael Ogezi
B. Hauer
Grzegorz Kondrak
VLM
61
0
0
05 Jun 2023
Balancing the Picture: Debiasing Vision-Language Datasets with Synthetic Contrast Sets
Brandon Smith
Miguel Farinha
Elizaveta Semenova
Hannah Rose Kirk
Aleksandar Shtedritski
Max Bain
90
19
0
24 May 2023
Gender Biases in Automatic Evaluation Metrics for Image Captioning
Haoyi Qiu
Zi-Yi Dou
Tianlu Wang
Asli Celikyilmaz
Nanyun Peng
EGVM
119
16
0
24 May 2023
Learning the Visualness of Text Using Large Vision-Language Models
Gaurav Verma
Ryan Rossi
Chris Tensmeyer
Jiuxiang Gu
A. Nenkova
VLM
71
0
0
11 May 2023
What does CLIP know about a red circle? Visual prompt engineering for VLMs
Aleksandar Shtedritski
Christian Rupprecht
Andrea Vedaldi
VLM, MLLM
113
162
0
13 Apr 2023
Mitigating Spurious Correlations in Multi-modal Models during Fine-tuning
Yu Yang
Besmira Nushi
Hamid Palangi
Baharan Mirzasoleiman
103
39
0
08 Apr 2023
Self-Supervised Multimodal Learning: A Survey
Yongshuo Zong
Oisin Mac Aodha
Timothy M. Hospedales
SSL
125
50
0
31 Mar 2023
MultiModal Bias: Introducing a Framework for Stereotypical Bias Assessment beyond Gender and Race in Vision Language Models
Sepehr Janghorbani
Gerard de Melo
VLM
106
12
0
16 Mar 2023
Automatic Geo-alignment of Artwork in Children's Story Books
Jakub J Dylag
V. Suarez
James Wald
Aneesha Amodini Uvara
DiffM
67
0
0
16 Mar 2023
Text2Face: A Multi-Modal 3D Face Model
William Rowan
P. Huber
Nick E. Pears
Andrew Keeling
3DH
59
3
0
05 Mar 2023
Towards Reliable Assessments of Demographic Disparities in Multi-Label Image Classifiers
Melissa Hall
Bobbie Chern
Laura Gustafson
Denisse Ventura
Harshad Kulkarni
Candace Ross
Nicolas Usunier
71
6
0
16 Feb 2023
A Friendly Face: Do Text-to-Image Systems Rely on Stereotypes when the Input is Under-Specified?
Kathleen C. Fraser
S. Kiritchenko
I. Nejadgholi
DiffM
80
38
0
14 Feb 2023
Auditing Gender Presentation Differences in Text-to-Image Models
Yanzhe Zhang
Lu Jiang
Greg Turk
Diyi Yang
EGVM
90
24
0
07 Feb 2023
Debiasing Vision-Language Models via Biased Prompts
Ching-Yao Chuang
Varun Jampani
Yuanzhen Li
Antonio Torralba
Stefanie Jegelka
VLM
122
107
0
31 Jan 2023
Discovering and Mitigating Visual Biases through Keyword Explanation
Younghyun Kim
Sangwoo Mo
Minkyu Kim
Kyungmin Lee
Jaeho Lee
Jinwoo Shin
165
34
0
26 Jan 2023
Vision-Language Models Performing Zero-Shot Tasks Exhibit Gender-based Disparities
Melissa Hall
Laura Gustafson
Aaron B. Adcock
Ishan Misra
Candace Ross
VLM
101
24
0
26 Jan 2023
Contrastive Language-Vision AI Models Pretrained on Web-Scraped Multimodal Data Exhibit Sexual Objectification Bias
Robert Wolfe
Yiwei Yang
Billy Howe
Aylin Caliskan
DiffM
133
57
0
21 Dec 2022
Improving Zero-Shot Models with Label Distribution Priors
Jonathan Kahana
Niv Cohen
Yedid Hoshen
VLM
138
14
0
01 Dec 2022
Zero-shot Image Captioning by Anchor-augmented Vision-Language Space Alignment
Junyan Wang
Yi Zhang
Ming Yan
Ji Zhang
Jitao Sang
VLM
64
9
0
14 Nov 2022
Evaluating and Improving Factuality in Multimodal Abstractive Summarization
David Wan
Joey Tianyi Zhou
93
10
0
04 Nov 2022
Masked Vision-Language Transformer in Fashion
Ge-Peng Ji
Mingchen Zhuge
D. Gao
Deng-Ping Fan
Daniel Gehrig
Luc Van Gool
90
25
0
27 Oct 2022
FairCLIP: Social Bias Elimination based on Attribute Prototype Learning and Representation Neutralization
Junyan Wang
Yi Zhang
Jitao Sang
FaML, VLM
102
24
0
26 Oct 2022
Neural Eigenfunctions Are Structured Representation Learners
Zhijie Deng
Jiaxin Shi
Hao Zhang
Peng Cui
Cewu Lu
Jun Zhu
109
14
0
23 Oct 2022
Self-supervised debiasing using low rank regularization
Geon Yeong Park
Chanyong Jung
Sangmin Lee
Jong Chul Ye
Sang Wan Lee
CML, SSL
106
4
0
11 Oct 2022
American == White in Multimodal Language-and-Image AI
Robert Wolfe
Aylin Caliskan
VLM
90
51
0
01 Jul 2022
Know your audience: specializing grounded language models with listener subtraction
Aaditya K. Singh
David Ding
Andrew M. Saxe
Felix Hill
Andrew Kyle Lampinen
65
2
0
16 Jun 2022
DORA: Exploring Outlier Representations in Deep Neural Networks
Kirill Bykov
Mayukh Deb
Dennis Grinwald
Klaus-Robert Muller
Marina M.-C. Höhne
123
13
0
09 Jun 2022
Markedness in Visual Semantic AI
Robert Wolfe
Aylin Caliskan
VLM
112
36
0
23 May 2022
Evidence for Hypodescent in Visual Semantic AI
Robert Wolfe
M. Banaji
Aylin Caliskan
VLM
96
38
0
22 May 2022
A CLIP-Hitchhiker's Guide to Long Video Retrieval
Max Bain
Arsha Nagrani
Gül Varol
Andrew Zisserman
CLIP
191
62
0
17 May 2022
ReCLIP: A Strong Zero-Shot Baseline for Referring Expression Comprehension
Sanjay Subramanian
William Merrill
Trevor Darrell
Matt Gardner
Sameer Singh
Anna Rohrbach
ObjD
114
128
0
12 Apr 2022
A Prompt Array Keeps the Bias Away: Debiasing Vision-Language Models with Adversarial Learning
Hugo Elias Berg
Elizaveta Semenova
Yash Bhalgat
Wonsuk Yang
Hannah Rose Kirk
Aleksandar Shtedritski
Max Bain
VLM
108
101
0
22 Mar 2022
Contrastive Visual Semantic Pretraining Magnifies the Semantics of Natural Language Representations
Robert Wolfe
Aylin Caliskan
VLM
67
14
0
14 Mar 2022
Scaling Open-Vocabulary Image Segmentation with Image-Level Labels
Golnaz Ghiasi
Xiuye Gu
Huayu Chen
Nayeon Lee
VLM
188
387
0
22 Dec 2021
Twitter-COMMs: Detecting Climate, COVID, and Military Multimodal Misinformation
Giscard Biamby
Grace Luo
Trevor Darrell
Anna Rohrbach
68
26
0
16 Dec 2021
Text2Mesh: Text-Driven Neural Stylization for Meshes
O. Michel
Roi Bar-On
Richard Liu
Sagie Benaim
Rana Hanocka
CLIP, AI4CE
280
362
0
06 Dec 2021
CLOOB: Modern Hopfield Networks with InfoLOOB Outperform CLIP
Andreas Fürst
Elisabeth Rumetshofer
Johannes Lehner
Viet-Hung Tran
Fei Tang
...
David P. Kreil
Michael K Kopp
Günter Klambauer
Angela Bitto-Nemling
Sepp Hochreiter
VLM, CLIP
314
104
0
21 Oct 2021
CLIPScore: A Reference-free Evaluation Metric for Image Captioning
Jack Hessel
Ari Holtzman
Maxwell Forbes
Ronan Le Bras
Yejin Choi
CLIP
269
1,597
0
18 Apr 2021