2305.17369
Modularized Zero-shot VQA with Pre-trained Models
27 May 2023
Rui Cao
Jing Jiang
LRM
Papers citing "Modularized Zero-shot VQA with Pre-trained Models"
6 / 6 papers shown
Title
Task-Agnostic Attacks Against Vision Foundation Models
Brian Pulfer
Yury Belousov
Vitaliy Kinakh
Teddy Furon
S. Voloshynovskiy
AAML
68
0
0
05 Mar 2025
VSA4VQA: Scaling a Vector Symbolic Architecture to Visual Question Answering on Natural Images
Anna Penzkofer
Lei Shi
Andreas Bulling
23
0
0
06 May 2024
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
382
4,010
0
28 Jan 2022
An Empirical Study of GPT-3 for Few-Shot Knowledge-Based VQA
Zhengyuan Yang
Zhe Gan
Jianfeng Wang
Xiaowei Hu
Yumao Lu
Zicheng Liu
Lijuan Wang
169
401
0
10 Sep 2021
How Much Can CLIP Benefit Vision-and-Language Tasks?
Sheng Shen
Liunian Harold Li
Hao Tan
Mohit Bansal
Anna Rohrbach
Kai-Wei Chang
Z. Yao
Kurt Keutzer
CLIP
VLM
MLLM
185
403
0
13 Jul 2021
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding
Akira Fukui
Dong Huk Park
Daylen Yang
Anna Rohrbach
Trevor Darrell
Marcus Rohrbach
144
1,458
0
06 Jun 2016