Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2407.14500
Cited By
ViLLa: Video Reasoning Segmentation with Large Language Model
18 July 2024
Rongkun Zheng
Lu Qi
Xi Chen
Yi Wang
Kun Wang
Yu Qiao
Hengshuang Zhao
VOS
LRM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"ViLLa: Video Reasoning Segmentation with Large Language Model"
8 / 8 papers shown
Title
The Devil is in Temporal Token: High Quality Video Reasoning Segmentation
Sitong Gong
Yunzhi Zhuge
Lu Zhang
Z. Yang
Pingping Zhang
Huchuan Lu
29
0
0
15 Jan 2025
Motion-Grounded Video Reasoning: Understanding and Perceiving Motion at Pixel Level
Andong Deng
Tongjia Chen
Shoubin Yu
Taojiannan Yang
Lincoln Spencer
Yapeng Tian
Ajmal Saeed Mian
Mohit Bansal
Chen Chen
LRM
44
1
0
15 Nov 2024
DVIS-DAQ: Improving Video Segmentation via Dynamic Anchor Queries
Yikang Zhou
Tao Zhang
Shunping Ji
Shuicheng Yan
Xiangtai Li
21
5
0
29 Mar 2024
OMG-Seg: Is One Model Good Enough For All Segmentation?
Xiangtai Li
Haobo Yuan
Wei Li
Henghui Ding
Size Wu
Wenwei Zhang
Yining Li
Kai Chen
Chen Change Loy
VLM
MLLM
ViT
59
48
0
18 Jan 2024
mPLUG-Owl: Modularization Empowers Large Language Models with Multimodality
Qinghao Ye
Haiyang Xu
Guohai Xu
Jiabo Ye
Ming Yan
...
Junfeng Tian
Qiang Qi
Ji Zhang
Feiyan Huang
Jingren Zhou
VLM
MLLM
198
883
0
27 Apr 2023
Instruction Tuning with GPT-4
Baolin Peng
Chunyuan Li
Pengcheng He
Michel Galley
Jianfeng Gao
SyDa
ALM
LM&MA
154
576
0
06 Apr 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
244
4,186
0
30 Jan 2023
STEm-Seg: Spatio-temporal Embeddings for Instance Segmentation in Videos
A. Athar
Sabarinath Mahadevan
Aljosa Osep
Laura Leal-Taixé
Bastian Leibe
VOS
70
169
0
18 Mar 2020
1