2308.08544
MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions
16 August 2023
Henghui Ding
Chang Liu
Shuting He
Xudong Jiang
Chen Change Loy
VOS
Papers citing
"MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions"
50 / 93 papers shown
Title
Learning Streaming Video Representation via Multitask Training
Yibin Yan
Jilan Xu
Shangzhe Di
Yikun Liu
Yudi Shi
Qirui Chen
Zeqian Li
Yifei Huang
Weidi Xie
CLL
76
0
0
28 Apr 2025
Few-Shot Referring Video Single- and Multi-Object Segmentation via Cross-Modal Affinity with Instance Sequence Matching
Heng Liu
Guanghui Li
Mingqi Gao
Xiantong Zhen
Feng Zheng
Y. Wang
VOS
35
0
0
18 Apr 2025
PVUW 2025 Challenge Report: Advances in Pixel-level Understanding of Complex Videos in the Wild
Henghui Ding
Chang Liu
Nikhila Ravi
Shuting He
Y. Wei
...
Haobo Yuan
X. Li
Tao Zhang
Lu Qi
Ming Yang
21
0
0
15 Apr 2025
MASSeg: 2nd Technical Report for 4th PVUW MOSE Track
Xuqiang Cao
Linnan Zhao
Jiaxuan Zhao
Fang Liu
Puhua Chen
Wenping Ma
30
0
0
14 Apr 2025
FVOS for MOSE Track of 4th PVUW Challenge: 3rd Place Solution
Mengjiao Wang
Junpei Zhang
Xu Liu
Yuting Yang
Mengru Ma
VOS
45
0
0
13 Apr 2025
STSeg-Complex Video Object Segmentation: The 1st Solution for 4th PVUW MOSE Challenge
Kehuan Song
Xinglin Xie
Kexin Zhang
Licheng Jiao
Lingling Li
S. M. I. Simon X. Yang
VOS
40
0
0
11 Apr 2025
The 1st Solution for 4th PVUW MeViS Challenge: Unleashing the Potential of Large Multimodal Models for Referring Video Segmentation
Hao Fang
Runmin Cong
Xiankai Lu
Z. Chen
Wei Zhang
29
0
0
07 Apr 2025
REEF: Relevance-Aware and Efficient LLM Adapter for Video Understanding
Sakib Reza
Xiyun Song
Heather Yu
Zongfang Lin
Mohsen Moghaddam
Octavia Camps
20
0
0
07 Apr 2025
4th PVUW MeViS 3rd Place Report: Sa2VA
Haobo Yuan
Tao Zhang
X. Li
Lu Qi
Zilong Huang
Shilin Xu
Jiashi Feng
Ming Yang
33
1
0
01 Apr 2025
ReferDINO-Plus: 2nd Solution for 4th PVUW MeViS Challenge at CVPR 2025
Tianming Liang
Haichao Jiang
Wei-Shi Zheng
Jian-Fang Hu
29
0
0
30 Mar 2025
Exploiting Temporal State Space Sharing for Video Semantic Segmentation
Syed Ariff Syed Hesham
Yun Liu
Guolei Sun
Henghui Ding
Jing Yang
Ender Konukoglu
Xue Geng
Xudong Jiang
40
1
0
26 Mar 2025
TransAnimate: Taming Layer Diffusion to Generate RGBA Video
Xuewei Chen
Zhimin Chen
Yiren Song
VGen
61
0
0
23 Mar 2025
InstructVEdit: A Holistic Approach for Instructional Video Editing
Chi Zhang
C. Feng
Feng Yan
Qiming Zhang
Mingjin Zhang
Yujie Zhong
Jing Zhang
Lin Ma
DiffM
VGen
36
0
0
22 Mar 2025
V2P-Bench: Evaluating Video-Language Understanding with Visual Prompts for Better Human-Model Interaction
Yiming Zhao
Y. Zeng
Yukun Qi
Y. Liu
Lin Yen-Chen
Zehui Chen
Xikun Bao
Jie Zhao
Feng Zhao
VLM
53
2
0
22 Mar 2025
QuartDepth: Post-Training Quantization for Real-Time Depth Estimation on the Edge
Xuan Shen
Weize Ma
Jing Liu
Changdi Yang
Rui Ding
...
Wei Niu
Yanzhi Wang
Pu Zhao
Jun Lin
Jiuxiang Gu
MQ
47
0
0
20 Mar 2025
MagicMotion: Controllable Video Generation with Dense-to-Sparse Trajectory Guidance
Quanhao Li
Zhen Xing
Rui Wang
Hui Zhang
Qi Dai
Zuxuan Wu
VGen
61
0
0
20 Mar 2025
AUTV: Creating Underwater Video Datasets with Pixel-wise Annotations
Quang-Trung Truong
Wong Yuk Kwan
Duc Thanh Nguyen
Binh-Son Hua
Sai-Kit Yeung
VGen
48
0
0
17 Mar 2025
SAM2 for Image and Video Segmentation: A Comprehensive Survey
Zhang Jiaxing
Tang Hao
VLM
50
0
0
17 Mar 2025
Dynamic Derivation and Elimination: Audio Visual Segmentation with Enhanced Audio Semantics
Chen Liu
Liying Yang
Peike Li
Dadong Wang
Lincheng Li
Xin Yu
VOS
94
0
0
17 Mar 2025
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey
Y. Wang
Shengqiong Wu
Y. Zhang
William Yang Wang
Ziwei Liu
Jiebo Luo
Hao Fei
LRM
74
7
0
16 Mar 2025
CPAny: Couple With Any Encoder to Refer Multi-Object Tracking
Weize Li
Yunhao Du
Qixiang Yin
Zhicheng Zhao
Fei Su
Daqi Liu
54
0
0
10 Mar 2025
Find First, Track Next: Decoupling Identification and Propagation in Referring Video Object Segmentation
Suhwan Cho
Seunghoon Lee
Minhyeok Lee
Jungho Lee
Sangyoun Lee
VOS
77
0
0
05 Mar 2025
IteRPrimE: Zero-shot Referring Image Segmentation with Iterative Grad-CAM Refinement and Primary Word Emphasis
Y. Wang
Jingchen Ni
Yong-Jin Liu
Chun Yuan
Yansong Tang
34
1
0
02 Mar 2025
Unhackable Temporal Rewarding for Scalable Video MLLMs
En Yu
Kangheng Lin
Liang Zhao
Yana Wei
Zining Zhu
...
Jianjian Sun
Zheng Ge
X. Zhang
Jingyu Wang
Wenbing Tao
52
4
0
17 Feb 2025
MPG-SAM 2: Adapting SAM 2 with Mask Priors and Global Context for Referring Video Object Segmentation
Fu Rong
Meng Lan
Q. Zhang
L. Zhang
VOS
VGen
65
1
0
23 Jan 2025
InternVideo2.5: Empowering Video MLLMs with Long and Rich Context Modeling
Yi Wang
Xinhao Li
Ziang Yan
Yinan He
Jiashuo Yu
...
Kai Chen
Wenhai Wang
Yu Qiao
Yali Wang
Limin Wang
64
19
0
21 Jan 2025
The Devil is in Temporal Token: High Quality Video Reasoning Segmentation
Sitong Gong
Yunzhi Zhuge
Lu Zhang
Z. Yang
Pingping Zhang
Huchuan Lu
34
0
0
15 Jan 2025
Omni-RGPT: Unifying Image and Video Region-level Understanding via Token Marks
Miran Heo
Min-Hung Chen
De-An Huang
Sifei Liu
Subhashree Radhakrishnan
Seon Joo Kim
Yu-Chun Wang
Ryo Hachiuma
ObjD
VLM
108
2
0
14 Jan 2025
Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos
Haobo Yuan
X. Li
Tao Zhang
Zilong Huang
Shilin Xu
S. Ji
Yunhai Tong
Lu Qi
Jiashi Feng
Ming Yang
VLM
79
11
0
07 Jan 2025
Hierarchical Alignment-enhanced Adaptive Grounding Network for Generalized Referring Expression Comprehension
Yaxian Wang
Henghui Ding
Shuting He
Xudong Jiang
Bifan Wei
Jun Liu
ObjD
30
1
0
03 Jan 2025
VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM
Yuqian Yuan
Hang Zhang
Wentong Li
Zesen Cheng
Boqiang Zhang
...
Deli Zhao
Wenqiao Zhang
Yueting Zhuang
Jianke Zhu
Lidong Bing
58
5
0
31 Dec 2024
InstructSeg: Unifying Instructed Visual Segmentation with Multi-modal Large Language Models
Cong Wei
Yujie Zhong
Haoxian Tan
Yingsen Zeng
Y. Liu
Zheng Zhao
Yujiu Yang
MLLM
VLM
VOS
85
2
0
18 Dec 2024
M^3-VOS: Multi-Phase, Multi-Transition, and Multi-Scenery Video Object Segmentation
Zixuan Chen
Jiaxin Li
Liming Tan
Yejie Guo
Junxuan Liang
Cewu Lu
Y. Li
VOS
63
0
0
18 Dec 2024
Referring Video Object Segmentation via Language-aligned Track Selection
Seongchan Kim
Woojeong Jin
Sangbeom Lim
Heeji Yoon
Hyunwook Choi
Seungryong Kim
VOS
84
0
0
02 Dec 2024
HyperSeg: Towards Universal Visual Segmentation with Large Language Model
Cong Wei
Yujie Zhong
Haoxian Tan
Y. Liu
Zheng Zhao
Jie Hu
Yujiu Yang
VOS
MLLM
VLM
LRM
84
1
0
26 Nov 2024
SAMWISE: Infusing Wisdom in SAM2 for Text-Driven Video Segmentation
Claudia Cuttano
Gabriele Trivigno
Gabriele Rosi
Carlo Masone
Giuseppe Averta
VOS
94
1
0
26 Nov 2024
Motion-Grounded Video Reasoning: Understanding and Perceiving Motion at Pixel Level
Andong Deng
Tongjia Chen
Shoubin Yu
Taojiannan Yang
Lincoln Spencer
Yapeng Tian
Ajmal Saeed Mian
Mohit Bansal
Chen Chen
LRM
46
1
0
15 Nov 2024
VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos
Shehan Munasinghe
Hanan Gani
Wenqi Zhu
Jiale Cao
Eric P. Xing
F. Khan
Salman Khan
MLLM
VGen
VLM
42
6
0
07 Nov 2024
Unleashing the Potential of the Diffusion Model in Few-shot Semantic Segmentation
Muzhi Zhu
Yang Liu
Zekai Luo
Chenchen Jing
Hao Chen
Guangkai Xu
Xinlong Wang
Chunhua Shen
DiffM
VLM
29
2
0
03 Oct 2024
ChatVTG: Video Temporal Grounding via Chat with Video Dialogue Large Language Models
Mengxue Qu
Xiaodong Chen
Wu Liu
Alicia Li
Yao Zhao
37
13
0
01 Oct 2024
One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos
Zechen Bai
Tong He
Haiyang Mei
Pichao Wang
Ziteng Gao
Joya Chen
Lei Liu
Zheng Zhang
Mike Zheng Shou
VLM
VOS
MLLM
32
17
0
29 Sep 2024
LSVOS Challenge Report: Large-scale Complex and Long Video Object Segmentation
Henghui Ding
Lingyi Hong
Chang Liu
Ning Xu
L. Yang
...
Bin Cao
Yisi Zhang
Hanyi Wang
Xingjian He
Jing Liu
VOS
19
2
0
09 Sep 2024
Discriminative Spatial-Semantic VOS Solution: 1st Place Solution for 6th LSVOS
Deshui Miao
Yameng Gu
Xin Li
Zhenyu He
Yaowei Wang
Ming-Hsuan Yang
19
0
0
29 Aug 2024
Unleashing the Temporal-Spatial Reasoning Capacity of GPT for Training-Free Audio and Language Referenced Video Object Segmentation
Shaofei Huang
Rui Ling
Hongyu Li
Tianrui Hui
Zongheng Tang
Xiaoming Wei
Jizhong Han
Si Liu
VOS
13
4
0
28 Aug 2024
CSS-Segment: 2nd Place Report of LSVOS Challenge VOS Track
Jinming Chai
Qin Ma
Junpei Zhang
Licheng Jiao
Fang Liu
VOS
23
0
0
24 Aug 2024
The 2nd Solution for LSVOS Challenge RVOS Track: Spatial-temporal Refinement for Consistent Semantic Segmentation
Tuyen Tran
23
2
0
22 Aug 2024
LSVOS Challenge 3rd Place Report: SAM2 and Cutie based VOS
Xinyu Liu
Jing Zhang
Kexin Zhang
Xu Liu
Lingling Li
11
1
0
20 Aug 2024
UNINEXT-Cutie: The 1st Solution for LSVOS Challenge RVOS Track
Hao Fang
Feiyu Pan
Xiankai Lu
Wei Zhang
Runmin Cong
21
3
0
19 Aug 2024
Video Object Segmentation via SAM 2: The 4th Solution for LSVOS Challenge VOS Track
Feiyu Pan
Hao Fang
Runmin Cong
Wei Zhang
Xiankai Lu
VOS
25
3
0
19 Aug 2024
Language-Driven Interactive Shadow Detection
Hongqiu Wang
Wei Wang
Haipeng Zhou
Huihui Xu
Shaozhi Wu
Lei Zhu
16
6
0
16 Aug 2024