1608.00525
Cited By
Modeling Context Between Objects for Referring Expression Understanding
1 August 2016
Varun K. Nagaraja
Vlad I. Morariu
Larry S. Davis
Papers citing
"Modeling Context Between Objects for Referring Expression Understanding"
24 / 24 papers shown
Title
RESAnything: Attribute Prompting for Arbitrary Referring Segmentation
Ruiqi Wang
Hao Zhang
VLM
56
0
0
03 May 2025
LGD: Leveraging Generative Descriptions for Zero-Shot Referring Image Segmentation
Jiachen Li
Qing Xie
Xiaohan Yu
Hongyun Wang
Jinyu Xu
Yongjian Liu
ObjD
76
0
0
20 Apr 2025
Aligning Generative Denoising with Discriminative Objectives Unleashes Diffusion for Visual Perception
Ziqi Pang
Xin Xu
Yu-Xiong Wang
DiffM
65
0
0
15 Apr 2025
Hybrid Global-Local Representation with Augmented Spatial Guidance for Zero-Shot Referring Image Segmentation
Ting Liu
Siyuan Li
44
0
0
01 Apr 2025
GroundingSuite: Measuring Complex Multi-Granular Pixel Grounding
R. Hu
Lianghui Zhu
Yuxuan Zhang
Tianheng Cheng
Lei Liu
Heng Liu
Longjin Ran
Xiaoxin Chen
Wenyu Liu
Xinggang Wang
ObjD
56
0
0
13 Mar 2025
SwimVG: Step-wise Multimodal Fusion and Adaption for Visual Grounding
Liangtao Shi
Ting Liu
Xiantao Hu
Yue Hu
Quanjun Yin
Richang Hong
ObjD
46
0
0
24 Feb 2025
MQADet: A Plug-and-Play Paradigm for Enhancing Open-Vocabulary Object Detection via Multimodal Question Answering
Caixiong Li
Xiongwei Zhao
Jinhang Zhang
Xing Zhang
Qihao Sun
Zhou Wu
ObjD
MLLM
VLM
51
0
0
23 Feb 2025
AeroReformer: Aerial Referring Transformer for UAV-based Referring Image Segmentation
Rui Li
Xiaowei Zhao
54
0
0
23 Feb 2025
Multi-task Visual Grounding with Coarse-to-Fine Consistency Constraints
Ming Dai
Jian Li
Jiedong Zhuang
Xian Zhang
Wankou Yang
ObjD
42
1
0
12 Jan 2025
MaPPER: Multimodal Prior-guided Parameter Efficient Tuning for Referring Expression Comprehension
Ting Liu
Zunnan Xu
Yue Hu
Liangtao Shi
Zhiqiang Wang
Quanjun Yin
57
2
0
03 Jan 2025
Paint Outside the Box: Synthesizing and Selecting Training Data for Visual Grounding
Zilin Du
Haoxin Li
Jianfei Yu
Boyang Li
143
0
0
01 Dec 2024
SAMWISE: Infusing Wisdom in SAM2 for Text-Driven Video Segmentation
Claudia Cuttano
Gabriele Trivigno
Gabriele Rosi
Carlo Masone
Giuseppe Averta
VOS
106
2
0
26 Nov 2024
A Parameter-Efficient Tuning Framework for Language-guided Object Grounding and Robot Grasping
Houjian Yu
Mingen Li
Alireza Rezazadeh
Yang Yang
Changhyun Choi
42
1
0
28 Sep 2024
HiFi-CS: Towards Open Vocabulary Visual Grounding For Robotic Grasping Using Vision-Language Models
V. Bhat
P. Krishnamurthy
Ramesh Karri
Farshad Khorrami
44
3
0
16 Sep 2024
Learning Visual Grounding from Generative Vision and Language Model
Shijie Wang
Dahun Kim
A. Taalimi
Chen Sun
Weicheng Kuo
ObjD
34
5
0
18 Jul 2024
EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model
Yuxuan Zhang
Tianheng Cheng
Lianghui Zhu
Lei Liu
Heng Liu
Longjin Ran
Xiaoxin Chen
Wenyu Liu
Xinggang Wang
VLM
53
24
0
28 Jun 2024
F-LMM: Grounding Frozen Large Multimodal Models
Size Wu
Sheng Jin
Wenwei Zhang
Lumin Xu
Wentao Liu
Wei Li
Chen Change Loy
MLLM
73
12
0
09 Jun 2024
RESMatch: Referring Expression Segmentation in a Semi-Supervised Manner
Ying-Dong Zang
Chenglong Fu
Runlong Cao
Didi Zhu
Min Zhang
Wenjun Hu
Lanyun Zhu
Tianrun Chen
30
6
0
08 Feb 2024
Mask Grounding for Referring Image Segmentation
Yong Xien Chng
Henry Zheng
Yizeng Han
Xuchong Qiu
Gao Huang
ISeg
ObjD
24
15
0
19 Dec 2023
Language-Guided Diffusion Model for Visual Grounding
Sijia Chen
Baochun Li
27
5
0
18 Aug 2023
Extending CLIP's Image-Text Alignment to Referring Image Segmentation
Seoyeon Kim
Minguk Kang
Dongwon Kim
Jaesik Park
Suha Kwak
VLM
25
10
0
14 Jun 2023
Improving Visual Grounding with Visual-Linguistic Verification and Iterative Reasoning
Li Yang
Yan Xu
Chunfen Yuan
Wei Liu
Bing Li
Weiming Hu
ObjD
34
113
0
30 Apr 2022
TubeDETR: Spatio-Temporal Video Grounding with Transformers
Antoine Yang
Antoine Miech
Josef Sivic
Ivan Laptev
Cordelia Schmid
ViT
28
94
0
30 Mar 2022
Aligning Linguistic Words and Visual Semantic Units for Image Captioning
Longteng Guo
Jing Liu
Jinhui Tang
Jiangwei Li
W. Luo
Hanqing Lu
14
102
0
06 Aug 2019