2304.01715
Cited By
Towards Open-Vocabulary Video Instance Segmentation
IEEE International Conference on Computer Vision (ICCV), 2023
4 April 2023
Haochen Wang
Cilin Yan
Shuailong Wang
Xiaolong Jiang
Xu Tang
Yao Hu
Weidi Xie
E. Gavves
VOS
VLM
Github (87★)
Papers citing
"Towards Open-Vocabulary Video Instance Segmentation"
33 / 33 papers shown
Title
ReVSeg: Incentivizing the Reasoning Chain for Video Segmentation with Reinforcement Learning
Y. Li
Yingda Yin
Lingting Zhu
Weikai Chen
Shengju Qian
Xin Wang
Yanwei Fu
VOS
LRM
316
0
0
02 Dec 2025
Explicit Memory through Online 3D Gaussian Splatting Improves Class-Agnostic Video Segmentation
IEEE Robotics and Automation Letters (IEEE RA-L), 2025
Anthony Opipari
Aravindhan K. Krishnan
Shreekant Gayaka
Min Sun
Cheng-Hao Kuo
Arnie Sen
Odest Chadwicke Jenkins
VOS
3DGS
293
0
0
27 Oct 2025
MaskCaptioner: Learning to Jointly Segment and Caption Object Trajectories in Videos
Gabriel Fiastre
Antoine Yang
Cordelia Schmid
VOS
393
0
0
16 Oct 2025
Image-to-Video Transfer Learning based on Image-Language Foundation Models: A Comprehensive Survey
Jinxuan Li
Chaolei Tan
Haoxuan Chen
Jianxin Ma
Jian-Fang Hu
Wei-Shi Zheng
Jianhuang Lai
VLM
133
1
0
12 Oct 2025
UniPixel: Unified Object Referring and Segmentation for Pixel-Level Visual Reasoning
Ye Liu
Zongyang Ma
Junfu Pu
Zhongang Qi
Yang Wu
Mingyu Ding
Chang Wen Chen
MLLM
ObjD
LRM
339
2
0
22 Sep 2025
Generalized Decoupled Learning for Enhancing Open-Vocabulary Dense Perception
Junjie Wang
Keyu Chen
Yulin Li
Bin Chen
Hengshuang Zhao
Xiaojuan Qi
Zhuotao Tian
CLIP
VLM
118
1
0
15 Aug 2025
A Comprehensive Survey on Video Scene Parsing: Advances, Challenges, and Prospects
Guohuan Xie
Syed Ariff Syed Hesham
Wenya Guo
Bing Li
Ming-Ming Cheng
Guolei Sun
Yun-Hai Liu
150
1
0
16 Jun 2025
SAM2Auto: Auto Annotation Using FLASH
Arash Rocky
Q.M. Jonathan Wu
VGen
VLM
210
0
0
09 Jun 2025
SAM-I2V: Upgrading SAM to Support Promptable Video Segmentation with Less than 0.2% Training Cost
Computer Vision and Pattern Recognition (CVPR), 2025
Haiyang Mei
Pengyu Zhang
Mike Zheng Shou
VLM
205
2
0
02 Jun 2025
Reasoning Segmentation for Images and Videos: A Survey
Yiqing Shen
Chenjia Li
Fei Xiong
Jeong-O Jeong
Tianpeng Wang
Michael Latman
Mathias Unberath
VOS
404
8
0
24 May 2025
Learning Streaming Video Representation via Multitask Training
Yibin Yan
Jilan Xu
Shangzhe Di
Yikun Liu
Yudi Shi
Qirui Chen
Zeqian Li
Yifei Huang
Weidi Xie
CLL
452
3
0
28 Apr 2025
ZS-VCOS: Zero-Shot Video Camouflaged Object Segmentation By Optical Flow and Open Vocabulary Object Detection
Wenqi Guo
Mohamed Shehata
Shan Du
VLM
352
0
0
10 Apr 2025
Segment Anything, Even Occluded
Computer Vision and Pattern Recognition (CVPR), 2025
Wei-En Tai
Yu-Lin Shih
Cheng Sun
Y. Wang
Hwann-Tzong Chen
VLM
273
1
0
08 Mar 2025
Omni-RGPT: Unifying Image and Video Region-level Understanding via Token Marks
Computer Vision and Pattern Recognition (CVPR), 2025
Miran Heo
Min-Hung Chen
De-An Huang
Sifei Liu
Subhashree Radhakrishnan
Seon Joo Kim
Yu-Chun Wang
Ryo Hachiuma
ObjD
VLM
504
8
0
14 Jan 2025
EdgeTAM: On-Device Track Anything Model
Computer Vision and Pattern Recognition (CVPR), 2025
Chong Zhou
Chenchen Zhu
Yunyang Xiong
Saksham Suri
Fanyi Xiao
...
Raghuraman Krishnamoorthi
Bo Dai
Chen Change Loy
Vikas Chandra
Bilge Soran
VLM
284
8
0
13 Jan 2025
Towards Open-Vocabulary Video Semantic Segmentation
IEEE Transactions on Multimedia (IEEE TMM), 2024
Xuelong Li
Yun Liu
Guolei Sun
Min Wu
Le Zhang
Ce Zhu
VLM
VOS
307
2
0
12 Dec 2024
Advancing Myopia To Holism: Fully Contrastive Language-Image Pre-training
Computer Vision and Pattern Recognition (CVPR), 2024
Haicheng Wang
Chen Ju
Weixiong Lin
Shuai Xiao
Mengting Chen
...
Mingshuai Yao
Jinsong Lan
Ying Chen
Qingwen Liu
Yanfeng Wang
VLM
CLIP
342
8
0
30 Nov 2024
OVT-B: A New Large-Scale Benchmark for Open-Vocabulary Multi-Object Tracking
Neural Information Processing Systems (NeurIPS), 2024
Haiji Liang
Ruize Han
VLM
329
4
0
23 Oct 2024
Configurable Embodied Data Generation for Class-Agnostic RGB-D Video Segmentation
IEEE Robotics and Automation Letters (RA-L), 2024
Anthony Opipari
Aravindhan K. Krishnan
Shreekant Gayaka
Min Sun
Cheng-Hao Kuo
Arnie Sen
Odest Chadwicke Jenkins
VOS
230
1
0
16 Oct 2024
SAM 2: Segment Anything in Images and Videos
International Conference on Learning Representations (ICLR), 2024
Nikhila Ravi
Valentin Gabeur
Yuan-Ting Hu
Ronghang Hu
Chaitanya K. Ryali
...
Nicolas Carion
Chao-Yuan Wu
Ross B. Girshick
Piotr Dollár
Christoph Feichtenhofer
VLM
MLLM
470
2,093
0
01 Aug 2024
Open-Vocabulary Audio-Visual Semantic Segmentation
Zhenghao Zhang
Junchao Liao
Dantong Niu
Yanyu Qi
Menghao Li
Ji Shi
Bowei Xing
Xianghua Ying
VOS
VLM
214
17
0
31 Jul 2024
OCTrack: Benchmarking the Open-Corpus Multi-Object Tracking
Zekun Qian
Ruize Han
Wei Feng
Junhui Hou
Linqi Song
Song Wang
261
1
0
19 Jul 2024
ViLLa: Video Reasoning Segmentation with Large Language Model
Rongkun Zheng
Lu Qi
Xi Chen
Yi Wang
Kun Wang
Yu Qiao
Hengshuang Zhao
VOS
LRM
457
16
0
18 Jul 2024
VISA: Reasoning Video Object Segmentation via Large Language Models
Cilin Yan
Haochen Wang
Shilin Yan
Xiaolong Jiang
Yao Hu
Guoliang Kang
Weidi Xie
E. Gavves
LRM
VLM
VOS
212
91
0
16 Jul 2024
Unified Embedding Alignment for Open-Vocabulary Video Instance Segmentation
Hao Fang
Peng Wu
Yawei Li
Xinxin Zhang
Xiankai Lu
VLM
271
19
0
10 Jul 2024
DENOISER: Rethinking the Robustness for Open-Vocabulary Action Recognition
Haozhe Cheng
Chen Ju
Haicheng Wang
Jinxiang Liu
Mengting Chen
Qiang Hu
Xiaoyun Zhang
Yanfeng Wang
DiffM
VLM
214
8
0
23 Apr 2024
CLIP-VIS: Adapting CLIP for Open-Vocabulary Video Instance Segmentation
Wenqi Zhu
Jiale Cao
Jin Xie
Shuangming Yang
Yanwei Pang
VLM
CLIP
258
10
0
19 Mar 2024
Instance Brownian Bridge as Texts for Open-vocabulary Video Instance Segmentation
Ze-Long Cheng
Kehan Li
Hao Li
Peng Jin
Chang Liu
Xiawu Zheng
Rongrong Ji
Jie Chen
VOS
249
4
0
18 Jan 2024
General Object Foundation Model for Images and Videos at Scale
Computer Vision and Pattern Recognition (CVPR), 2023
Junfeng Wu
Yi Jiang
Qihao Liu
Zehuan Yuan
Xiang Bai
Song Bai
VOS
VLM
300
74
0
14 Dec 2023
A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Chaoyang Zhu
Long Chen
ObjD
VLM
483
64
0
18 Jul 2023
Towards Open Vocabulary Learning: A Survey
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Jianzong Wu
Xiangtai Li
Shilin Xu
Haobo Yuan
Henghui Ding
...
Jiangning Zhang
Yu Tong
Xudong Jiang
Guohao Li
Dacheng Tao
ObjD
VLM
390
213
0
28 Jun 2023
OpenVIS: Open-vocabulary Video Instance Segmentation
AAAI Conference on Artificial Intelligence (AAAI), 2023
Pinxue Guo
Tony Huang
Peiyang He
Xuefeng Liu
Tianjun Xiao
Zhaoyu Chen
Wenqiang Zhang
VLM
183
24
0
26 May 2023
PiClick: Picking the desired mask from multiple candidates in click-based interactive segmentation
Neurocomputing, 2023
Cilin Yan
Haochen Wang
Jie Liu
Xiaolong Jiang
Yao Hu
Xu Tang
Guoliang Kang
E. Gavves
VLM
248
3
0
23 Apr 2023