Evaluating CLIP: Towards Characterization of Broader Capabilities and Downstream Implications

5 August 2021

Papers citing "Evaluating CLIP: Towards Characterization of Broader Capabilities and Downstream Implications"

Showing 45 of 95 citing papers

Composed Image Retrieval using Contrastive Learning and Task-oriented CLIP-based Features Alberto Baldrati Marco Bertini Tiberio Uricchio A. Bimbo CLIP CoGe 69 35 0 22 Aug 2023
PerceptionCLIP: Visual Classification by Inferring and Conditioning on Contexts Bang An Sicheng Zhu Michael-Andrei Panaitescu-Liess Chaithanya Kumar Mummadi Furong Huang VLM 95 8 0 02 Aug 2023
Distilling Knowledge from Text-to-Image Generative Models Improves Visio-Linguistic Reasoning in CLIP S. Basu S. Hu Maziar Sanjabi Daniela Massiceti Soheil Feizi VLM 46 4 0 18 Jul 2023
What a MESS: Multi-Domain Evaluation of Zero-Shot Semantic Segmentation Benedikt Blumenstiel Johannes Jakubik Hilde Kuhne Michael Vossing VLM 129 18 0 27 Jun 2023
VisoGender: A dataset for benchmarking gender bias in image-text pronoun resolution Elizaveta Semenova F. G. Abrantes Hanwen Zhu Grace A. Sodunke Aleksandar Shtedritski Hannah Rose Kirk CoGe 125 46 0 21 Jun 2023
RS5M and GeoRSCLIP: A Large Scale Vision-Language Dataset and A Large Vision-Language Model for Remote Sensing Zilun Zhang Tiancheng Zhao Yulong Guo Yuxiang Cai DiffM VLM 159 66 0 20 Jun 2023
Modular Visual Question Answering via Code Generation Sanjay Subramanian Medhini Narasimhan Kushal Khangaonkar Kevin Kaichuang Yang Arsha Nagrani Cordelia Schmid Andy Zeng Trevor Darrell Dan Klein 77 51 0 08 Jun 2023
Semantically-Prompted Language Models Improve Visual Descriptions Michael Ogezi B. Hauer Grzegorz Kondrak VLM 61 0 0 05 Jun 2023
Balancing the Picture: Debiasing Vision-Language Datasets with Synthetic Contrast Sets Brandon Smith Miguel Farinha Elizaveta Semenova Hannah Rose Kirk Aleksandar Shtedritski Max Bain 90 19 0 24 May 2023
Gender Biases in Automatic Evaluation Metrics for Image Captioning Haoyi Qiu Zi-Yi Dou Tianlu Wang Asli Celikyilmaz Nanyun Peng EGVM 119 16 0 24 May 2023
Learning the Visualness of Text Using Large Vision-Language Models Gaurav Verma Ryan Rossi Chris Tensmeyer Jiuxiang Gu A. Nenkova VLM 71 0 0 11 May 2023
What does CLIP know about a red circle? Visual prompt engineering for VLMs Aleksandar Shtedritski Christian Rupprecht Andrea Vedaldi VLM MLLM 113 162 0 13 Apr 2023
Mitigating Spurious Correlations in Multi-modal Models during Fine-tuning Yu Yang Besmira Nushi Hamid Palangi Baharan Mirzasoleiman 103 39 0 08 Apr 2023
Self-Supervised Multimodal Learning: A Survey Yongshuo Zong Oisin Mac Aodha Timothy M. Hospedales SSL 125 50 0 31 Mar 2023
MultiModal Bias: Introducing a Framework for Stereotypical Bias Assessment beyond Gender and Race in Vision Language Models Sepehr Janghorbani Gerard de Melo VLM 106 12 0 16 Mar 2023
Automatic Geo-alignment of Artwork in Children's Story Books Jakub J Dylag V. Suarez James Wald Aneesha Amodini Uvara DiffM 67 0 0 16 Mar 2023
Text2Face: A Multi-Modal 3D Face Model William Rowan P. Huber Nick E. Pears Andrew Keeling 3DH 59 3 0 05 Mar 2023
Towards Reliable Assessments of Demographic Disparities in Multi-Label Image Classifiers Melissa Hall Bobbie Chern Laura Gustafson Denisse Ventura Harshad Kulkarni Candace Ross Nicolas Usunier 71 6 0 16 Feb 2023
A Friendly Face: Do Text-to-Image Systems Rely on Stereotypes when the Input is Under-Specified? Kathleen C. Fraser S. Kiritchenko I. Nejadgholi DiffM 80 38 0 14 Feb 2023
Auditing Gender Presentation Differences in Text-to-Image Models Yanzhe Zhang Lu Jiang Greg Turk Diyi Yang EGVM 90 24 0 07 Feb 2023
Debiasing Vision-Language Models via Biased Prompts Ching-Yao Chuang Varun Jampani Yuanzhen Li Antonio Torralba Stefanie Jegelka VLM 122 107 0 31 Jan 2023
Discovering and Mitigating Visual Biases through Keyword Explanation Younghyun Kim Sangwoo Mo Minkyu Kim Kyungmin Lee Jaeho Lee Jinwoo Shin 165 34 0 26 Jan 2023
Vision-Language Models Performing Zero-Shot Tasks Exhibit Gender-based Disparities Melissa Hall Laura Gustafson Aaron B. Adcock Ishan Misra Candace Ross VLM 101 24 0 26 Jan 2023
Contrastive Language-Vision AI Models Pretrained on Web-Scraped Multimodal Data Exhibit Sexual Objectification Bias Robert Wolfe Yiwei Yang Billy Howe Aylin Caliskan DiffM 133 57 0 21 Dec 2022
Improving Zero-Shot Models with Label Distribution Priors Jonathan Kahana Niv Cohen Yedid Hoshen VLM 138 14 0 01 Dec 2022
Zero-shot Image Captioning by Anchor-augmented Vision-Language Space Alignment Junyan Wang Yi Zhang Ming Yan Ji Zhang Jitao Sang VLM 64 9 0 14 Nov 2022
Evaluating and Improving Factuality in Multimodal Abstractive Summarization David Wan Joey Tianyi Zhou 93 10 0 04 Nov 2022
Masked Vision-Language Transformer in Fashion Ge-Peng Ji Mingchen Zhuge D. Gao Deng-Ping Fan Daniel Gehrig Luc Van Gool 90 25 0 27 Oct 2022
FairCLIP: Social Bias Elimination based on Attribute Prototype Learning and Representation Neutralization Junyan Wang Yi Zhang Jitao Sang FaML VLM 102 24 0 26 Oct 2022
Neural Eigenfunctions Are Structured Representation Learners Zhijie Deng Jiaxin Shi Hao Zhang Peng Cui Cewu Lu Jun Zhu 109 14 0 23 Oct 2022
Self-supervised debiasing using low rank regularization Geon Yeong Park Chanyong Jung Sangmin Lee Jong Chul Ye Sang Wan Lee CML SSL 106 4 0 11 Oct 2022
American == White in Multimodal Language-and-Image AI Robert Wolfe Aylin Caliskan VLM 90 51 0 01 Jul 2022
Know your audience: specializing grounded language models with listener subtraction Aaditya K. Singh David Ding Andrew M. Saxe Felix Hill Andrew Kyle Lampinen 65 2 0 16 Jun 2022
DORA: Exploring Outlier Representations in Deep Neural Networks Kirill Bykov Mayukh Deb Dennis Grinwald Klaus-Robert Muller Marina M.-C. Höhne 123 13 0 09 Jun 2022
Markedness in Visual Semantic AI Robert Wolfe Aylin Caliskan VLM 112 36 0 23 May 2022
Evidence for Hypodescent in Visual Semantic AI Robert Wolfe M. Banaji Aylin Caliskan VLM 96 38 0 22 May 2022
A CLIP-Hitchhiker's Guide to Long Video Retrieval Max Bain Arsha Nagrani Gül Varol Andrew Zisserman CLIP 191 62 0 17 May 2022
ReCLIP: A Strong Zero-Shot Baseline for Referring Expression Comprehension Sanjay Subramanian William Merrill Trevor Darrell Matt Gardner Sameer Singh Anna Rohrbach ObjD 114 128 0 12 Apr 2022
A Prompt Array Keeps the Bias Away: Debiasing Vision-Language Models with Adversarial Learning Hugo Elias Berg Elizaveta Semenova Yash Bhalgat Wonsuk Yang Hannah Rose Kirk Aleksandar Shtedritski Max Bain VLM 108 101 0 22 Mar 2022
Contrastive Visual Semantic Pretraining Magnifies the Semantics of Natural Language Representations Robert Wolfe Aylin Caliskan VLM 67 14 0 14 Mar 2022
Scaling Open-Vocabulary Image Segmentation with Image-Level Labels Golnaz Ghiasi Xiuye Gu Huayu Chen Nayeon Lee VLM 188 387 0 22 Dec 2021
Twitter-COMMs: Detecting Climate, COVID, and Military Multimodal Misinformation Giscard Biamby Grace Luo Trevor Darrell Anna Rohrbach 68 26 0 16 Dec 2021
Text2Mesh: Text-Driven Neural Stylization for Meshes O. Michel Roi Bar-On Richard Liu Sagie Benaim Rana Hanocka CLIP AI4CE 280 362 0 06 Dec 2021
CLOOB: Modern Hopfield Networks with InfoLOOB Outperform CLIP Andreas Fürst Elisabeth Rumetshofer Johannes Lehner Viet-Hung Tran Fei Tang ... David P. Kreil Michael K Kopp Günter Klambauer Angela Bitto-Nemling Sepp Hochreiter VLM CLIP 314 104 0 21 Oct 2021
CLIPScore: A Reference-free Evaluation Metric for Image Captioning Jack Hessel Ari Holtzman Maxwell Forbes Ronan Le Bras Yejin Choi CLIP 269 1,597 0 18 Apr 2021