Towards Open-Vocabulary Video Instance Segmentation

4 April 2023
Haochen Wang
Cilin Yan
Shuailong Wang
Xiaolong Jiang
Xu Tang
Yao Hu
Weidi Xie
E. Gavves
    VOS
    VLM
arXiv: 2304.01715

Papers citing "Towards Open-Vocabulary Video Instance Segmentation"

25 / 25 papers shown
Learning Streaming Video Representation via Multitask Training
Yibin Yan
Jilan Xu
Shangzhe Di
Yikun Liu
Yudi Shi
Qirui Chen
Zeqian Li
Yifei Huang
Weidi Xie
CLL
82
0
0
28 Apr 2025
ZS-VCOS: Zero-Shot Outperforms Supervised Video Camouflaged Object Segmentation
Wenqi Guo
Shan Du
VLM
52
0
0
10 Apr 2025
Segment Anything, Even Occluded
Wei-En Tai
Yu-Lin Shih
Cheng Sun
Y. Wang
Hwann-Tzong Chen
VLM
60
0
0
08 Mar 2025
Omni-RGPT: Unifying Image and Video Region-level Understanding via Token Marks
Miran Heo
Min-Hung Chen
De-An Huang
Sifei Liu
Subhashree Radhakrishnan
Seon Joo Kim
Yu-Chun Wang
Ryo Hachiuma
ObjD
VLM
127
2
0
14 Jan 2025
EdgeTAM: On-Device Track Anything Model
Chong Zhou
Chenchen Zhu
Yunyang Xiong
Saksham Suri
Fanyi Xiao
...
Raghuraman Krishnamoorthi
Bo Dai
Chen Change Loy
Vikas Chandra
Bilge Soran
VLM
58
0
0
13 Jan 2025
Towards Open-Vocabulary Video Semantic Segmentation
X. Li
Yun Liu
Guolei Sun
Min Wu
Le Zhang
Ce Zhu
VLM
VOS
85
1
0
12 Dec 2024
Advancing Myopia To Holism: Fully Contrastive Language-Image Pre-training
Haicheng Wang
Chen Ju
Weixiong Lin
Shuai Xiao
Mengting Chen
...
Mingshuai Yao
Jinsong Lan
Ying Chen
Qingwen Liu
Yanfeng Wang
VLM
CLIP
70
4
0
30 Nov 2024
OVT-B: A New Large-Scale Benchmark for Open-Vocabulary Multi-Object Tracking
Haiji Liang
Ruize Han
VLM
21
1
0
23 Oct 2024
Configurable Embodied Data Generation for Class-Agnostic RGB-D Video Segmentation
Anthony Opipari
Aravindhan K. Krishnan
Shreekant Gayaka
Min Sun
Cheng-Hao Kuo
Arnie Sen
Odest Chadwicke Jenkins
VOS
36
0
0
16 Oct 2024
SAM 2: Segment Anything in Images and Videos
Nikhila Ravi
Valentin Gabeur
Yuan-Ting Hu
Ronghang Hu
Chaitanya K. Ryali
...
Nicolas Carion
Chao-Yuan Wu
Ross B. Girshick
Piotr Dollár
Christoph Feichtenhofer
VLM
MLLM
31
705
0
01 Aug 2024
Open-Vocabulary Audio-Visual Semantic Segmentation
Zhenghao Zhang
Junchao Liao
Dantong Niu
Yanyu Qi
Menghao Li
Ji Shi
Bowei Xing
Xianghua Ying
VOS
VLM
32
7
0
31 Jul 2024
OCTrack: Benchmarking the Open-Corpus Multi-Object Tracking
Zekun Qian
Ruize Han
Wei Feng
Junhui Hou
Linqi Song
Song Wang
32
1
0
19 Jul 2024
ViLLa: Video Reasoning Segmentation with Large Language Model
Rongkun Zheng
Lu Qi
Xi Chen
Yi Wang
Kun Wang
Yu Qiao
Hengshuang Zhao
VOS
LRM
52
2
0
18 Jul 2024
VISA: Reasoning Video Object Segmentation via Large Language Models
Cilin Yan
Haochen Wang
Shilin Yan
Xiaolong Jiang
Yao Hu
Guoliang Kang
Weidi Xie
E. Gavves
LRM
VLM
VOS
37
28
0
16 Jul 2024
Unified Embedding Alignment for Open-Vocabulary Video Instance Segmentation
Hao Fang
Peng Wu
Yawei Li
Xinxin Zhang
Xiankai Lu
VLM
22
6
0
10 Jul 2024
DENOISER: Rethinking the Robustness for Open-Vocabulary Action Recognition
Haozhe Cheng
Chen Ju
Haicheng Wang
Jinxiang Liu
Mengting Chen
Qiang Hu
Xiaoyun Zhang
Yanfeng Wang
DiffM
VLM
38
5
0
23 Apr 2024
CLIP-VIS: Adapting CLIP for Open-Vocabulary Video Instance Segmentation
Wenqi Zhu
Jiale Cao
Jin Xie
Shuangming Yang
Yanwei Pang
VLM
CLIP
37
2
0
19 Mar 2024
Instance Brownian Bridge as Texts for Open-vocabulary Video Instance Segmentation
Ze-Long Cheng
Kehan Li
Hao Li
Peng Jin
Chang Liu
Xiawu Zheng
Rongrong Ji
Jie Chen
VOS
28
2
0
18 Jan 2024
General Object Foundation Model for Images and Videos at Scale
Junfeng Wu
Yi-Xin Jiang
Qihao Liu
Zehuan Yuan
Xiang Bai
Song Bai
VOS
VLM
25
38
0
14 Dec 2023
A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future
Chaoyang Zhu
Long Chen
ObjD
VLM
24
32
0
18 Jul 2023
Towards Open Vocabulary Learning: A Survey
Jianzong Wu
Xiangtai Li
Shilin Xu
Haobo Yuan
Henghui Ding
...
Jiangning Zhang
Yu Tong
Xudong Jiang
Bernard Ghanem
Dacheng Tao
ObjD
VLM
27
134
0
28 Jun 2023
OpenVIS: Open-vocabulary Video Instance Segmentation
Pinxue Guo
Tony Huang
Peiyang He
Xuefeng Liu
Tianjun Xiao
Zhaoyu Chen
Wenqiang Zhang
VLM
33
16
0
26 May 2023
PiClick: Picking the desired mask from multiple candidates in click-based interactive segmentation
Cilin Yan
Haochen Wang
Jie Liu
Xiaolong Jiang
Yao Hu
Xu Tang
Guoliang Kang
E. Gavves
VLM
22
0
0
23 Apr 2023
Open-vocabulary Object Detection via Vision and Language Knowledge Distillation
Xiuye Gu
Tsung-Yi Lin
Weicheng Kuo
Yin Cui
VLM
ObjD
223
897
0
28 Apr 2021
STEm-Seg: Spatio-temporal Embeddings for Instance Segmentation in Videos
A. Athar
Sabarinath Mahadevan
Aljosa Osep
Laura Leal-Taixé
Bastian Leibe
VOS
70
170
0
18 Mar 2020