2312.15715
Cited By
UniRef++: Segment Every Reference Object in Spatial and Temporal Spaces
25 December 2023
Jiannan Wu
Yi-Xin Jiang
Bin Yan
Huchuan Lu
Zehuan Yuan
Ping Luo
VOS
Papers citing
"UniRef++: Segment Every Reference Object in Spatial and Temporal Spaces"
19 / 19 papers shown
Title
RESAnything: Attribute Prompting for Arbitrary Referring Segmentation
Ruiqi Wang
Hao Zhang
VLM
52
0
0
03 May 2025
Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos
Haobo Yuan
X. Li
Tao Zhang
Zilong Huang
Shilin Xu
S. Ji
Yunhai Tong
Lu Qi
Jiashi Feng
Ming Yang
VLM
89
11
0
07 Jan 2025
Motion-Grounded Video Reasoning: Understanding and Perceiving Motion at Pixel Level
Andong Deng
Tongjia Chen
Shoubin Yu
Taojiannan Yang
Lincoln Spencer
Yapeng Tian
Ajmal Saeed Mian
Mohit Bansal
Chen Chen
LRM
54
1
0
15 Nov 2024
X-Fi: A Modality-Invariant Foundation Model for Multimodal Human Sensing
Xinyan Chen
Jianfei Yang
28
1
0
14 Oct 2024
HiFi-CS: Towards Open Vocabulary Visual Grounding For Robotic Grasping Using Vision-Language Models
V. Bhat
P. Krishnamurthy
Ramesh Karri
Farshad Khorrami
42
3
0
16 Sep 2024
EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model
Yuxuan Zhang
Tianheng Cheng
Lianghui Zhu
Lei Liu
Heng Liu
Longjin Ran
Xiaoxin Chen
Wenyu Liu
Xinggang Wang
VLM
51
24
0
28 Jun 2024
UniVS: Unified and Universal Video Segmentation with Prompts as Queries
Ming-hui Li
Shuai Li
Xindong Zhang
Lei Zhang
VOS
33
16
0
28 Feb 2024
VLT: Vision-Language Transformer and Query Generation for Referring Segmentation
Henghui Ding
Chang Liu
Suchen Wang
Xudong Jiang
63
115
0
28 Oct 2022
UViM: A Unified Modeling Approach for Vision with Learned Guiding Codes
Alexander Kolesnikov
André Susano Pinto
Lucas Beyer
Xiaohua Zhai
Jeremiah Harmsen
N. Houlsby
103
67
0
20 May 2022
MulT: An End-to-End Multitask Learning Transformer
Deblina Bhattacharjee
Tong Zhang
Sabine Süsstrunk
Mathieu Salzmann
ViT
29
62
0
17 May 2022
Reliable Propagation-Correction Modulation for Video Object Segmentation
Xiaohao Xu
Jinglu Wang
Xiao Li
Yan Lu
VOS
35
61
0
06 Dec 2021
LAVT: Language-Aware Vision Transformer for Referring Image Segmentation
Zhao Yang
Jiaqi Wang
Yansong Tang
Kai-xiang Chen
Hengshuang Zhao
Philip H. S. Torr
133
308
0
04 Dec 2021
Hierarchical Memory Matching Network for Video Object Segmentation
Hongje Seong
Seoung Wug Oh
Joon-Young Lee
Seongwon Lee
Suhyeon Lee
Euntai Kim
VOS
39
103
0
23 Sep 2021
Pix2seq: A Language Modeling Framework for Object Detection
Ting-Li Chen
Saurabh Saxena
Lala Li
David J. Fleet
Geoffrey E. Hinton
MLLM
ViT
VLM
233
341
0
22 Sep 2021
Collaborative Video Object Segmentation by Multi-Scale Foreground-Background Integration
Zongxin Yang
Yunchao Wei
Yi Yang
VOS
33
160
0
13 Oct 2020
Multi-task Collaborative Network for Joint Referring Expression Comprehension and Segmentation
Gen Luo
Yiyi Zhou
Xiaoshuai Sun
Liujuan Cao
Chenglin Wu
Cheng Deng
Rongrong Ji
ObjD
159
282
0
19 Mar 2020
Conditional Convolutions for Instance Segmentation
Zhi Tian
Chunhua Shen
Hao Chen
ISeg
167
596
0
12 Mar 2020
Learning Fast and Robust Target Models for Video Object Segmentation
Andreas Robinson
Felix Järemo Lawin
Martin Danelljan
F. Khan
M. Felsberg
VOS
47
137
0
27 Feb 2020
Feature Pyramid Networks for Object Detection
Tsung-Yi Lin
Piotr Dollár
Ross B. Girshick
Kaiming He
Bharath Hariharan
Serge J. Belongie
ObjD
166
21,643
0
09 Dec 2016