2406.18591
Composition Vision-Language Understanding via Segment and Depth Anything Model
7 June 2024
Mingxiao Huo
Pengliang Ji
Haotian Lin
Junchen Liu
Yixiao Wang
Yijun Chen
VLM
Papers citing
"Composition Vision-Language Understanding via Segment and Depth Anything Model"
4 / 4 papers shown
Title
First-place Solution for Streetscape Shop Sign Recognition Competition
Bin Wang
Li Jing
115
0
0
06 Jan 2025
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data
Lihe Yang
Bingyi Kang
Zilong Huang
Xiaogang Xu
Jiashi Feng
Hengshuang Zhao
VLM
139
706
0
19 Jan 2024
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
Bin Lin
Yang Ye
Bin Zhu
Jiaxi Cui
Munan Ning
Peng Jin
Li-ming Yuan
VLM
MLLM
194
587
0
16 Nov 2023
MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning
Jun Chen
Deyao Zhu
Xiaoqian Shen
Xiang Li
Zechun Liu
Pengchuan Zhang
Raghuraman Krishnamoorthi
Vikas Chandra
Yunyang Xiong
Mohamed Elhoseiny
MLLM
160
440
0
14 Oct 2023