Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2408.06610
Cited By
CROME: Cross-Modal Adapters for Efficient Multimodal LLM
13 August 2024
Sayna Ebrahimi
Sercan Ö. Arik
Tejas Nama
Tomas Pfister
Re-assign community
ArXiv
PDF
HTML
Papers citing
"CROME: Cross-Modal Adapters for Efficient Multimodal LLM"
4 / 4 papers shown
Title
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
244
4,186
0
30 Jan 2023
Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering
Pan Lu
Swaroop Mishra
Tony Xia
Liang Qiu
Kai-Wei Chang
Song-Chun Zhu
Oyvind Tafjord
Peter Clark
A. Kalyan
ELM
ReLM
LRM
198
1,089
0
20 Sep 2022
AdaptFormer: Adapting Vision Transformers for Scalable Visual Recognition
Shoufa Chen
Chongjian Ge
Zhan Tong
Jiangliu Wang
Yibing Song
Jue Wang
Ping Luo
138
631
0
26 May 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
380
4,010
0
28 Jan 2022
1