2405.20795
Cited By
InsightSee: Advancing Multi-agent Vision-Language Models for Enhanced Visual Understanding
31 May 2024
Huaxiang Zhang
Yaojia Mu
Guo-Niu Zhu
Zhongxue Gan
Papers citing
"InsightSee: Advancing Multi-agent Vision-Language Models for Enhanced Visual Understanding"
5 / 5 papers shown
Title
OWLViz: An Open-World Benchmark for Visual Question Answering
T. Nguyen
Dang Nguyen
Hoang Nguyen
Thuan Luong
Long Hoang Dang
Viet Dac Lai
VLM
61
0
0
04 Mar 2025
VLM-Social-Nav: Socially Aware Robot Navigation through Scoring using Vision-Language Models
Daeun Song
Jing Liang
Amirreza Payandeh
Xuesu Xiao
Dinesh Manocha
21
13
0
30 Mar 2024
InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks
Zhe Chen
Jiannan Wu
Wenhai Wang
Weijie Su
Guo Chen
...
Bin Li
Ping Luo
Tong Lu
Yu Qiao
Jifeng Dai
VLM
MLLM
156
895
0
21 Dec 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
244
4,186
0
30 Jan 2023
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
315
8,261
0
28 Jan 2022