Benchmarking Zero-Shot Recognition with Vision-Language Models: Challenges on Granularity and Specificity

28 June 2023

Papers citing "Benchmarking Zero-Shot Recognition with Vision-Language Models: Challenges on Granularity and Specificity"

8 / 8 papers shown

Title | Authors | Tags | Counts | Date
Enhancing Collective Intelligence in Large Language Models Through Emotional Integration | Likith Kadiyala, Ramteja Sajja, Y. Sermet, Ibrahim Demir | | 108 0 0 | 05 Mar 2025
Robustness Analysis of Video-Language Models Against Visual and Language Perturbations | Madeline Chantry Schiappa, Shruti Vyas, Hamid Palangi, Y. S. Rawat, Vibhav Vineet | VLM | 114 17 0 | 05 Jul 2022
Discovering the Hidden Vocabulary of DALLE-2 | Giannis Daras, A. Dimakis | | 122 64 0 | 01 Jun 2022
GroupViT: Semantic Segmentation Emerges from Text Supervision | Jiarui Xu, Shalini De Mello, Sifei Liu, Wonmin Byeon, Thomas Breuel, Jan Kautz, X. Wang | ViT VLM | 180 499 0 | 22 Feb 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation | Junnan Li, Dongxu Li, Caiming Xiong, S. Hoi | MLLM BDL VLM CLIP | 390 4,124 0 | 28 Jan 2022
ActionCLIP: A New Paradigm for Video Action Recognition | Mengmeng Wang, Jiazheng Xing, Yong Liu | VLM | 149 362 0 | 17 Sep 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision | Chao Jia, Yinfei Yang, Ye Xia, Yi-Ting Chen, Zarana Parekh, Hieu H. Pham, Quoc V. Le, Yun-hsuan Sung, Zhen Li, Tom Duerig | VLM CLIP | 293 3,689 0 | 11 Feb 2021
ImageNet Large Scale Visual Recognition Challenge | Olga Russakovsky, Jia Deng, Hao Su, J. Krause, S. Satheesh, ..., A. Karpathy, A. Khosla, Michael S. Bernstein, Alexander C. Berg, Li Fei-Fei | VLM ObjD | 282 39,190 0 | 01 Sep 2014