2402.08360
Cited By
Visual Question Answering Instruction: Unlocking Multimodal Large Language Model To Domain-Specific Visual Multitasks
13 February 2024
Jusung Lee
Sungguk Cha
Younghyun Lee
Cheoljong Yang
MLLM
LRM
Papers citing "Visual Question Answering Instruction: Unlocking Multimodal Large Language Model To Domain-Specific Visual Multitasks"
3 / 3 papers shown
Title
OThink-MR1: Stimulating multimodal generalized reasoning capabilities via dynamic reinforcement learning
Zhiyuan Liu
Yuting Zhang
Feng Liu
Changwang Zhang
Ying Sun
Jun Wang
LRM
66
2
0
20 Mar 2025
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
244
4,186
0
30 Jan 2023
LAVIS: A Library for Language-Vision Intelligence
Dongxu Li
Junnan Li
Hung Le
Guangsen Wang
Silvio Savarese
S. Hoi
VLM
90
51
0
15 Sep 2022