2304.01715
Towards Open-Vocabulary Video Instance Segmentation
IEEE International Conference on Computer Vision (ICCV), 2023
4 April 2023
Haochen Wang
Cilin Yan
Shuailong Wang
Xiaolong Jiang
Xu Tang
Yao Hu
Weidi Xie
E. Gavves
VOS
VLM
Github (87★)
Papers citing
"Towards Open-Vocabulary Video Instance Segmentation"
33 / 33 papers shown
ReVSeg: Incentivizing the Reasoning Chain for Video Segmentation with Reinforcement Learning
Y. Li
Yingda Yin
Lingting Zhu
Weikai Chen
Shengju Qian
Xin Wang
Yanwei Fu
VOS
LRM
384
0
0
02 Dec 2025
Explicit Memory through Online 3D Gaussian Splatting Improves Class-Agnostic Video Segmentation
IEEE Robotics and Automation Letters (IEEE RA-L), 2025
Anthony Opipari
Aravindhan K. Krishnan
Shreekant Gayaka
Min Sun
Cheng-Hao Kuo
Arnie Sen
Odest Chadwicke Jenkins
VOS
3DGS
331
0
0
27 Oct 2025
MaskCaptioner: Learning to Jointly Segment and Caption Object Trajectories in Videos
Gabriel Fiastre
Antoine Yang
Cordelia Schmid
VOS
446
1
0
16 Oct 2025
Image-to-Video Transfer Learning based on Image-Language Foundation Models: A Comprehensive Survey
Jinxuan Li
Chaolei Tan
Haoxuan Chen
Jianxin Ma
Jian-Fang Hu
Wei-Shi Zheng
Jianhuang Lai
VLM
146
1
0
12 Oct 2025
UniPixel: Unified Object Referring and Segmentation for Pixel-Level Visual Reasoning
Ye Liu
Zongyang Ma
Junfu Pu
Zhongang Qi
Yang Wu
Mingyu Ding
Chang Wen Chen
MLLM
ObjD
LRM
375
2
0
22 Sep 2025
Generalized Decoupled Learning for Enhancing Open-Vocabulary Dense Perception
Junjie Wang
Keyu Chen
Yulin Li
Bin Chen
Hengshuang Zhao
Xiaojuan Qi
Zhuotao Tian
CLIP
VLM
130
1
0
15 Aug 2025
A Comprehensive Survey on Video Scene Parsing: Advances, Challenges, and Prospects
Guohuan Xie
Syed Ariff Syed Hesham
Wenya Guo
Bing Li
Ming-Ming Cheng
Guolei Sun
Yun-Hai Liu
166
1
0
16 Jun 2025
SAM2Auto: Auto Annotation Using FLASH
Arash Rocky
Q.M. Jonathan Wu
VGen
VLM
219
0
0
09 Jun 2025
SAM-I2V: Upgrading SAM to Support Promptable Video Segmentation with Less than 0.2% Training Cost
Computer Vision and Pattern Recognition (CVPR), 2025
Haiyang Mei
Pengyu Zhang
Mike Zheng Shou
VLM
247
2
0
02 Jun 2025
Reasoning Segmentation for Images and Videos: A Survey
Yiqing Shen
Chenjia Li
Fei Xiong
Jeong-O Jeong
Tianpeng Wang
Michael Latman
Mathias Unberath
VOS
423
9
0
24 May 2025
Learning Streaming Video Representation via Multitask Training
Yibin Yan
Jilan Xu
Shangzhe Di
Yikun Liu
Yudi Shi
Qirui Chen
Zeqian Li
Yifei Huang
Weidi Xie
CLL
498
3
0
28 Apr 2025
ZS-VCOS: Zero-Shot Video Camouflaged Object Segmentation By Optical Flow and Open Vocabulary Object Detection
Wenqi Guo
Mohamed Shehata
Shan Du
VLM
434
0
0
10 Apr 2025
Segment Anything, Even Occluded
Computer Vision and Pattern Recognition (CVPR), 2025
Wei-En Tai
Yu-Lin Shih
Cheng Sun
Y. Wang
Hwann-Tzong Chen
VLM
309
2
0
08 Mar 2025
Omni-RGPT: Unifying Image and Video Region-level Understanding via Token Marks
Computer Vision and Pattern Recognition (CVPR), 2025
Miran Heo
Min-Hung Chen
De-An Huang
Sifei Liu
Subhashree Radhakrishnan
Seon Joo Kim
Yu-Chun Wang
Ryo Hachiuma
ObjD
VLM
529
9
0
14 Jan 2025
EdgeTAM: On-Device Track Anything Model
Computer Vision and Pattern Recognition (CVPR), 2025
Chong Zhou
Chenchen Zhu
Yunyang Xiong
Saksham Suri
Fanyi Xiao
...
Raghuraman Krishnamoorthi
Bo Dai
Chen Change Loy
Vikas Chandra
Bilge Soran
VLM
309
8
0
13 Jan 2025
Towards Open-Vocabulary Video Semantic Segmentation
IEEE Transactions on Multimedia (IEEE TMM), 2024
Xuelong Li
Yun-Hai Liu
Guolei Sun
Min Wu
Le Zhang
Ce Zhu
VLM
VOS
334
2
0
12 Dec 2024
Advancing Myopia To Holism: Fully Contrastive Language-Image Pre-training
Computer Vision and Pattern Recognition (CVPR), 2024
Haicheng Wang
Chen Ju
Weixiong Lin
Shuai Xiao
Mengting Chen
...
Mingshuai Yao
Jinsong Lan
Ying Chen
Qingwen Liu
Yanfeng Wang
VLM
CLIP
370
8
0
30 Nov 2024
OVT-B: A New Large-Scale Benchmark for Open-Vocabulary Multi-Object Tracking
Neural Information Processing Systems (NeurIPS), 2024
Haiji Liang
Ruize Han
VLM
349
4
0
23 Oct 2024
Configurable Embodied Data Generation for Class-Agnostic RGB-D Video Segmentation
IEEE Robotics and Automation Letters (IEEE RA-L), 2024
Anthony Opipari
Aravindhan K. Krishnan
Shreekant Gayaka
Min Sun
Cheng-Hao Kuo
Arnie Sen
Odest Chadwicke Jenkins
VOS
255
1
0
16 Oct 2024
SAM 2: Segment Anything in Images and Videos
International Conference on Learning Representations (ICLR), 2024
Nikhila Ravi
Valentin Gabeur
Yuan-Ting Hu
Ronghang Hu
Chaitanya K. Ryali
...
Nicolas Carion
Chao-Yuan Wu
Ross B. Girshick
Piotr Dollár
Christoph Feichtenhofer
VLM
MLLM
492
2,187
0
01 Aug 2024
Open-Vocabulary Audio-Visual Semantic Segmentation
Zhenghao Zhang
Junchao Liao
Dantong Niu
Yanyu Qi
Menghao Li
Ji Shi
Bowei Xing
Xianghua Ying
VOS
VLM
250
18
0
31 Jul 2024
OCTrack: Benchmarking the Open-Corpus Multi-Object Tracking
Zekun Qian
Ruize Han
Wei Feng
Junhui Hou
Linqi Song
Song Wang
277
1
0
19 Jul 2024
ViLLa: Video Reasoning Segmentation with Large Language Model
Rongkun Zheng
Lu Qi
Xi Chen
Yi Wang
Kun Wang
Yu Qiao
Hengshuang Zhao
VOS
LRM
506
16
0
18 Jul 2024
VISA: Reasoning Video Object Segmentation via Large Language Models
Cilin Yan
Haochen Wang
Shilin Yan
Xiaolong Jiang
Yao Hu
Guoliang Kang
Weidi Xie
E. Gavves
LRM
VLM
VOS
237
92
0
16 Jul 2024
Unified Embedding Alignment for Open-Vocabulary Video Instance Segmentation
Hao Fang
Peng Wu
Yawei Li
Xinxin Zhang
Xiankai Lu
VLM
331
19
0
10 Jul 2024
DENOISER: Rethinking the Robustness for Open-Vocabulary Action Recognition
Haozhe Cheng
Chen Ju
Haicheng Wang
Jinxiang Liu
Mengting Chen
Qiang Hu
Xiaoyun Zhang
Yanfeng Wang
DiffM
VLM
232
8
0
23 Apr 2024
CLIP-VIS: Adapting CLIP for Open-Vocabulary Video Instance Segmentation
Wenqi Zhu
Jiale Cao
Jin Xie
Shuangming Yang
Yanwei Pang
VLM
CLIP
290
10
0
19 Mar 2024
Instance Brownian Bridge as Texts for Open-vocabulary Video Instance Segmentation
Ze-Long Cheng
Kehan Li
Hao Li
Peng Jin
Chang Liu
Xiawu Zheng
Rongrong Ji
Jie Chen
VOS
268
4
0
18 Jan 2024
General Object Foundation Model for Images and Videos at Scale
Computer Vision and Pattern Recognition (CVPR), 2023
Junfeng Wu
Yi Jiang
Qihao Liu
Zehuan Yuan
Xiang Bai
Song Bai
VOS
VLM
339
79
0
14 Dec 2023
A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Chaoyang Zhu
Long Chen
ObjD
VLM
510
67
0
18 Jul 2023
Towards Open Vocabulary Learning: A Survey
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Jianzong Wu
Xiangtai Li
Shilin Xu
Haobo Yuan
Henghui Ding
...
Jiangning Zhang
Yu Tong
Xudong Jiang
Guohao Li
Dacheng Tao
ObjD
VLM
406
218
0
28 Jun 2023
OpenVIS: Open-vocabulary Video Instance Segmentation
AAAI Conference on Artificial Intelligence (AAAI), 2023
Pinxue Guo
Tony Huang
Peiyang He
Xuefeng Liu
Tianjun Xiao
Zhaoyu Chen
Wenqiang Zhang
VLM
226
24
0
26 May 2023
PiClick: Picking the desired mask from multiple candidates in click-based interactive segmentation
Neurocomputing, 2023
Cilin Yan
Haochen Wang
Jie Liu
Xiaolong Jiang
Yao Hu
Xu Tang
Guoliang Kang
E. Gavves
VLM
273
4
0
23 Apr 2023