1812.03426
Cited By
Real-Time Referring Expression Comprehension by Single-Stage Grounding Network
9 December 2018
Xinpeng Chen
Lin Ma
Jingyuan Chen
Zequn Jie
Wen Liu
Jiebo Luo
ObjD
Papers citing
"Real-Time Referring Expression Comprehension by Single-Stage Grounding Network"
50 / 60 papers shown
Title
Prototype-Aware Multimodal Alignment for Open-Vocabulary Visual Grounding
Jiangnan Xie
Xiaolong Zheng
Liang Zheng
ObjD
129
0
0
08 Sep 2025
ReMeREC: Relation-aware and Multi-entity Referring Expression Comprehension
Yizhi Hu
Zezhao Tian
Xingqun Qi
Chen Su
Bingkun Yang
Junhui Yin
Muyi Sun
Man Zhang
Zhenan Sun
ObjD
106
0
0
22 Jul 2025
DenseGrounding: Improving Dense Language-Vision Semantics for Ego-Centric 3D Visual Grounding
International Conference on Learning Representations (ICLR), 2025
Henry Zheng
Hao Shi
Qihang Peng
Yong Xien Chng
Rui Huang
Yepeng Weng
Peng Wang
Gao Huang
254
5
0
08 May 2025
Towards Visual Grounding: A Survey
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024
Linhui Xiao
Xiaoshan Yang
X. Lan
Yaowei Wang
Changsheng Xu
ObjD
739
26
0
28 Dec 2024
To Predict or Not To Predict? Proportionally Masked Autoencoders for Tabular Data Imputation
Jungkyu Kim
Kibok Lee
Taeyoung Park
296
3
0
26 Dec 2024
Make Graph-based Referring Expression Comprehension Great Again through Expression-guided Dynamic Gating and Regression
IEEE Transactions on Multimedia (IEEE TMM), 2024
Jingcheng Ke
Dele Wang
Jun-Cheng Chen
I-Hong Jhuo
Chia-Wen Lin
Yen-Yu Lin
194
1
0
05 Sep 2024
NanoMVG: USV-Centric Low-Power Multi-Task Visual Grounding based on Prompt-Guided Camera and 4D mmWave Radar
Runwei Guan
Tao Huang
Liye Jia
Haocheng Zhao
Shanliang Yao
Xiaohui Zhu
Ka Lok Man
Eng Gee Lim
Jeremy S. Smith
Yutao Yue
327
7
0
30 Aug 2024
ResVG: Enhancing Relation and Semantic Understanding in Multiple Instances for Visual Grounding
ACM Multimedia (MM), 2024
Minghang Zheng
Jiahua Zhang
Qingchao Chen
Yuxin Peng
Yang Liu
ObjD
234
5
0
29 Aug 2024
Look Hear: Gaze Prediction for Speech-directed Human Attention
European Conference on Computer Vision (ECCV), 2024
Sounak Mondal
Seoyoung Ahn
Zhibo Yang
Niranjan Balasubramanian
Dimitris Samaras
G. Zelinsky
Minh Hoai
345
3
0
28 Jul 2024
Visual Grounding with Attention-Driven Constraint Balancing
Weitai Kang
Luowei Zhou
Junyi Wu
Changchang Sun
Yan Yan
161
8
0
03 Jul 2024
SegVG: Transferring Object Bounding Box to Segmentation for Visual Grounding
Weitai Kang
Gaowen Liu
Mubarak Shah
Yan Yan
ObjD
312
16
0
03 Jul 2024
WaterVG: Waterway Visual Grounding based on Text-Guided Vision and mmWave Radar
Runwei Guan
Liye Jia
Fengyufan Yang
Shanliang Yao
Erick Purwanto
...
Eng Gee Lim
Jeremy S. Smith
Ka Lok Man
Xuming Hu
Yutao Yue
291
16
0
19 Mar 2024
Beyond Literal Descriptions: Understanding and Locating Open-World Objects Aligned with Human Intentions
Wenxuan Wang
Yisi Zhang
Xingjian He
Yichen Yan
Zijia Zhao
Xinlong Wang
Jing Liu
LM&Ro
211
5
0
17 Feb 2024
Bridging Modality Gap for Visual Grounding with Effective Cross-modal Distillation
Chinese Conference on Pattern Recognition and Computer Vision (PRCV), 2023
Jiaxi Wang
Wenhui Hu
Xueyang Liu
Beihu Wu
Yuting Qiu
Yingying Cai
207
1
0
29 Dec 2023
Context Disentangling and Prototype Inheriting for Robust Visual Grounding
Wei Tang
Liang Li
Xuejing Liu
Lu Jin
Jinhui Tang
Zechao Li
198
41
0
19 Dec 2023
Mono3DVG: 3D Visual Grounding in Monocular Images
AAAI Conference on Artificial Intelligence (AAAI), 2023
Yangfan Zhan
Yuan Yuan
Zhitong Xiong
MDE
202
31
0
13 Dec 2023
GPT-4 Enhanced Multimodal Grounding for Autonomous Driving: Leveraging Cross-Modal Attention with Large Language Models
Haicheng Liao
Huanming Shen
Zhenning Li
Chengyue Wang
Guofa Li
Yiming Bie
Chengzhong Xu
194
79
0
06 Dec 2023
Augment the Pairs: Semantics-Preserving Image-Caption Pair Augmentation for Grounding-Based Vision and Language Models
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Jingru Yi
Burak Uzkent
Oana Ignat
Zili Li
Amanmeet Garg
Xiang Yu
Linda Liu
VLM
215
2
0
05 Nov 2023
RIO: A Benchmark for Reasoning Intention-Oriented Objects in Open Environments
Neural Information Processing Systems (NeurIPS), 2023
Mengxue Qu
Yu-Huan Wu
Wu Liu
Xiaodan Liang
Jingkuan Song
Yao-Min Zhao
Yunchao Wei
163
19
0
26 Oct 2023
OV-VG: A Benchmark for Open-Vocabulary Visual Grounding
Chunlei Wang
Wenquan Feng
Xiangtai Li
Guangliang Cheng
Shuchang Lyu
Binghao Liu
Lijiang Chen
Qi Zhao
ObjD
VLM
216
15
0
22 Oct 2023
Towards Complex-query Referring Image Segmentation: A Novel Benchmark
Wei Ji
Li Li
Marco Pleines
Xiangyan Liu
Xu Yang
Juncheng Billy Li
Roger Zimmermann
138
12
0
29 Sep 2023
Spatial and Visual Perspective-Taking via View Rotation and Relation Reasoning for Embodied Reference Understanding
European Conference on Computer Vision (ECCV), 2023
Cheng Shi
Sibei Yang
LRM
138
12
0
03 Sep 2023
GREC: Generalized Referring Expression Comprehension
Shuting He
Henghui Ding
Chang Liu
Xudong Jiang
ObjD
212
34
0
30 Aug 2023
EAVL: Explicitly Align Vision and Language for Referring Image Segmentation
Yimin Yan
Xingjian He
Wenxuan Wang
Sihan Chen
Qingbin Liu
ObjD
VLM
238
2
0
18 Aug 2023
Iterative Robust Visual Grounding with Masked Reference based Centerpoint Supervision
Menghao Li
Chunlei Wang
W. Feng
Shuchang Lyu
Guangliang Cheng
Xiangtai Li
Binghao Liu
Qi Zhao
247
6
0
23 Jul 2023
TreePrompt: Learning to Compose Tree Prompts for Explainable Visual Grounding
Chenchi Zhang
Jun Xiao
Lei Chen
Jian Shao
Long Chen
VLM
LRM
143
2
0
19 May 2023
CLIP-VG: Self-paced Curriculum Adapting of CLIP for Visual Grounding
IEEE Transactions on Multimedia (IEEE TMM), 2023
Linhui Xiao
Xiaoshan Yang
Fang Peng
Ming Yan
Yaowei Wang
Changsheng Xu
ObjD
VLM
332
55
0
15 May 2023
DQ-DETR: Dual Query Detection Transformer for Phrase Extraction and Grounding
AAAI Conference on Artificial Intelligence (AAAI), 2022
Siyi Liu
Yaoyuan Liang
Feng Li
Shijia Huang
Hao Zhang
Hang Su
Jun Zhu
Lei Zhang
ObjD
198
39
0
28 Nov 2022
YORO -- Lightweight End to End Visual Grounding
Chih-Hui Ho
Srikar Appalaraju
Bhavan A. Jasani
R. Manmatha
Nuno Vasconcelos
ObjD
144
26
0
15 Nov 2022
VLT: Vision-Language Transformer and Query Generation for Referring Segmentation
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Henghui Ding
Chang Liu
Suchen Wang
Xudong Jiang
251
151
0
28 Oct 2022
RSVG: Exploring Data and Models for Visual Grounding on Remote Sensing Data
IEEE Transactions on Geoscience and Remote Sensing (IEEE TGRS), 2022
Yangfan Zhan
Zhitong Xiong
Yuan Yuan
197
170
0
23 Oct 2022
Dynamic MDETR: A Dynamic Multimodal Transformer Decoder for Visual Grounding
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Fengyuan Shi
Ruopeng Gao
Weilin Huang
Limin Wang
166
42
0
28 Sep 2022
One for All: One-stage Referring Expression Comprehension with Dynamic Reasoning
Neurocomputing (Neurocomputing), 2022
Zhipeng Zhang
Zhimin Wei
Zhongzhen Huang
Rui Niu
Peng Wang
ObjD
LRM
204
9
0
31 Jul 2022
SiRi: A Simple Selective Retraining Mechanism for Transformer-based Visual Grounding
European Conference on Computer Vision (ECCV), 2022
Mengxue Qu
Yu Wu
Wu Liu
Qiqi Gong
Xiaodan Liang
Olga Russakovsky
Yao Zhao
Yunchao Wei
ObjD
101
26
0
27 Jul 2022
Bear the Query in Mind: Visual Grounding with Query-conditioned Convolution
Chonghan Chen
Qi Jiang
Chih-Hao Wang
Noel Chen
Haohan Wang
Xiang Li
Bhiksha Raj
ObjD
244
0
0
18 Jun 2022
TransVG++: End-to-End Visual Grounding with Language Conditioned Vision Transformer
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Jiajun Deng
Zhengyuan Yang
Daqing Liu
Tianlang Chen
Wen-gang Zhou
Yanyong Zhang
Houqiang Li
Wanli Ouyang
ViT
200
86
0
14 Jun 2022
Improving Visual Grounding with Visual-Linguistic Verification and Iterative Reasoning
Computer Vision and Pattern Recognition (CVPR), 2022
Li Yang
Yan Xu
Chunfen Yuan
Wei Liu
Bing Li
Weiming Hu
ObjD
221
151
0
30 Apr 2022
3D-SPS: Single-Stage 3D Visual Grounding via Referred Point Progressive Selection
Computer Vision and Pattern Recognition (CVPR), 2022
Jun-Bin Luo
Jiahui Fu
Xianghao Kong
Chen Gao
Haibing Ren
Hao Shen
Huaxia Xia
Si Liu
173
121
0
13 Apr 2022
FindIt: Generalized Localization with Natural Language Queries
European Conference on Computer Vision (ECCV), 2022
Weicheng Kuo
Fred Bertsch
Wei Li
A. Piergiovanni
M. Saffar
A. Angelova
ObjD
174
18
0
31 Mar 2022
Pseudo-Q: Generating Pseudo Language Queries for Visual Grounding
Computer Vision and Pattern Recognition (CVPR), 2022
Haojun Jiang
Yuanze Lin
Dongchen Han
Shiji Song
Gao Huang
ObjD
248
64
0
16 Mar 2022
Suspected Object Matters: Rethinking Model's Prediction for One-stage Visual Grounding
ACM Multimedia (ACM MM), 2022
Yang Jiao
Zequn Jie
Yue Yu
Lin Ma
Yu-Gang Jiang
OOD
143
9
0
10 Mar 2022
Deconfounded Visual Grounding
AAAI Conference on Artificial Intelligence (AAAI), 2021
Jianqiang Huang
Yu Qin
Jiaxin Qi
Qianru Sun
Hanwang Zhang
CML
ObjD
127
37
0
31 Dec 2021
Bottom Up Top Down Detection Transformers for Language Grounding in Images and Point Clouds
Ayush Jain
N. Gkanatsios
Ishita Mediratta
Katerina Fragkiadaki
ObjD
381
144
0
16 Dec 2021
Word2Pix: Word to Pixel Cross Attention Transformer in Visual Grounding
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2021
Heng Zhao
Qiufeng Wang
Yew-Soon Ong
ObjD
134
33
0
31 Jul 2021
Bridging the Gap Between Object Detection and User Intent via Query-Modulation
Marco Fornoni
Chaochao Yan
Liangchen Luo
Kimberly Wilber
A. Stark
Huayu Chen
Boqing Gong
Andrew G. Howard
ObjD
111
1
0
18 Jun 2021
Referring Transformer: A One-step Approach to Multi-task Visual Grounding
Neural Information Processing Systems (NeurIPS), 2021
Muchen Li
Leonid Sigal
ObjD
220
236
0
06 Jun 2021
VL-NMS: Breaking Proposal Bottlenecks in Two-Stage Visual-Language Matching
Chenchi Zhang
Wenbo Ma
Jun Xiao
Hanwang Zhang
Jian Shao
Yueting Zhuang
Long Chen
193
5
0
12 May 2021
Proposal-free One-stage Referring Expression via Grid-Word Cross-Attention
International Joint Conference on Artificial Intelligence (IJCAI), 2021
Wei Suo
Mengyang Sun
Peng Wang
Qi Wu
ObjD
138
14
0
05 May 2021
TransVG: End-to-End Visual Grounding with Transformers
IEEE International Conference on Computer Vision (ICCV), 2021
Jiajun Deng
Zhengyuan Yang
Tianlang Chen
Wen-gang Zhou
Houqiang Li
ViT
411
432
0
17 Apr 2021
PPGN: Phrase-Guided Proposal Generation Network For Referring Expression Comprehension
Chao Yang
Guoqing Wang
Dongsheng Li
Huawei Shen
Su Feng
Bin Jiang
92
3
0
20 Dec 2020