ResearchTrend.AI
arXiv: 2310.16033

Towards Perceiving Small Visual Details in Zero-shot Visual Question Answering with Multimodal LLMs

24 October 2023
Jiarui Zhang
Mahyar Khayatkhoei
P. Chhikara
Filip Ilievski

Papers citing "Towards Perceiving Small Visual Details in Zero-shot Visual Question Answering with Multimodal LLMs"

5 / 5 papers shown
Title
Scaling 3D Reasoning with LMMs to Large Robot Mission Environments Using Datagraphs
W. J. Meijer
A. C. Kemmeren
E.H.J. Riemens
J. E. Fransman
M. V. Bekkum
G. J. Burghouts
J. D. V. Mil
31
0
0
15 Jul 2024
V*: Guided Visual Search as a Core Mechanism in Multimodal LLMs
Penghao Wu
Saining Xie
LRM
49
122
0
21 Dec 2023
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
388
4,110
0
28 Jan 2022
Unifying Vision-and-Language Tasks via Text Generation
Jaemin Cho
Jie Lei
Hao Tan
Mohit Bansal
MLLM
249
525
0
04 Feb 2021
RSVQA: Visual Question Answering for Remote Sensing Data
Sylvain Lobry
Diego Marcos
J. Murray
D. Tuia
62
203
0
16 Mar 2020