2406.05318
Cited By
Integrating Text and Image Pre-training for Multi-modal Algorithmic Reasoning
8 June 2024
Zijian Zhang
Wei Liu
Papers citing "Integrating Text and Image Pre-training for Multi-modal Algorithmic Reasoning"
2 / 2 papers shown
MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning
Jun Chen
Deyao Zhu
Xiaoqian Shen
Xiang Li
Zechun Liu
Pengchuan Zhang
Raghuraman Krishnamoorthi
Vikas Chandra
Yunyang Xiong
Mohamed Elhoseiny
MLLM
154
280
0
14 Oct 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
244
4,186
0
30 Jan 2023