2406.16851
Losing Visual Needles in Image Haystacks: Vision Language Models are Easily Distracted in Short and Long Contexts
24 June 2024
Aditya Sharma
Michael Saxon
William Yang Wang
VLM
Papers citing
"Losing Visual Needles in Image Haystacks: Vision Language Models are Easily Distracted in Short and Long Contexts"
5 / 5 papers shown
MileBench: Benchmarking MLLMs in Long Context
Dingjie Song
Shunian Chen
Guiming Hardy Chen
Fei Yu
Xiang Wan
Benyou Wang
VLM
76
34
0
29 Apr 2024
Multi-Modal Hallucination Control by Visual Information Grounding
Alessandro Favero
L. Zancato
Matthew Trager
Siddharth Choudhary
Pramuditha Perera
Alessandro Achille
Ashwin Swaminathan
Stefano Soatto
MLLM
85
62
0
20 Mar 2024
Lost in Translation? Translation Errors and Challenges for Fair Assessment of Text-to-Image Models on Multilingual Concepts
Michael Stephen Saxon
Yiran Luo
Sharon Levy
Chitta Baral
Yezhou Yang
William Yang Wang
EGVM
25
3
0
17 Mar 2024
CogAgent: A Visual Language Model for GUI Agents
Wenyi Hong
Weihan Wang
Qingsong Lv
Jiazheng Xu
Wenmeng Yu
...
Juanzi Li
Bin Xu
Yuxiao Dong
Ming Ding
Jie Tang
MLLM
137
319
0
14 Dec 2023
Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering
Pan Lu
Swaroop Mishra
Tony Xia
Liang Qiu
Kai-Wei Chang
Song-Chun Zhu
Oyvind Tafjord
Peter Clark
A. Kalyan
ELM
ReLM
LRM
207
1,101
0
20 Sep 2022