ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2402.08360
  4. Cited By
Visual Question Answering Instruction: Unlocking Multimodal Large
  Language Model To Domain-Specific Visual Multitasks

Visual Question Answering Instruction: Unlocking Multimodal Large Language Model To Domain-Specific Visual Multitasks

13 February 2024
Jusung Lee
Sungguk Cha
Younghyun Lee
Cheoljong Yang
    MLLM
    LRM
ArXivPDFHTML

Papers citing "Visual Question Answering Instruction: Unlocking Multimodal Large Language Model To Domain-Specific Visual Multitasks"

3 / 3 papers shown
Title
OThink-MR1: Stimulating multimodal generalized reasoning capabilities via dynamic reinforcement learning
OThink-MR1: Stimulating multimodal generalized reasoning capabilities via dynamic reinforcement learning
Zhiyuan Liu
Yuting Zhang
Feng Liu
Changwang Zhang
Ying Sun
Jun Wang
LRM
66
2
0
20 Mar 2025
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image
  Encoders and Large Language Models
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
244
4,186
0
30 Jan 2023
LAVIS: A Library for Language-Vision Intelligence
LAVIS: A Library for Language-Vision Intelligence
Dongxu Li
Junnan Li
Hung Le
Guangsen Wang
Silvio Savarese
S. Hoi
VLM
90
51
0
15 Sep 2022
1