Zero-Shot Grounding of Objects from Natural Language Queries

IEEE International Conference on Computer Vision (ICCV), 2019

20 August 2019

Papers citing "Zero-Shot Grounding of Objects from Natural Language Queries"

50 / 90 papers shown

GeoViS: Geospatially Rewarded Visual Search for Remote Sensing Visual Grounding

233

02 Dec 2025

Enhancing Adversarial Transferability in Visual-Language Pre-training Models via Local Shuffle and Sample-based Attack

North American Chapter of the Association for Computational Linguistics (NAACL), 2025

Xin Liu

Aoyang Zhou

AAML

148

02 Nov 2025

Referring Expression Comprehension for Small Objects

180

04 Oct 2025

Prototype-Aware Multimodal Alignment for Open-Vocabulary Visual Grounding

194

08 Sep 2025

A Coarse-to-Fine Approach to Multi-Modality 3D Occupancy Grounding

350

02 Aug 2025

Talk2Event: Grounded Understanding of Dynamic Scenes from Event Cameras

372

23 Jul 2025

RemoteSAM: Towards Segment Anything for Earth Observation

833

23 May 2025

Towards Visual Grounding: A Survey

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024

1.1K

28 Dec 2024

Automatic Prompt Generation and Grounding Object Detection for Zero-Shot Image Anomaly Detection

Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2024

285

28 Nov 2024

AD-DINO: Attention-Dynamic DINO for Distance-Aware Embodied Reference Understanding

301

13 Nov 2024

Multi-Object 3D Grounding with Dynamic Modules and Language-Informed Spatial Attention

Neural Information Processing Systems (NeurIPS), 2024

Haomeng Zhang

Chiao-An Yang

Raymond A. Yeh

309

29 Oct 2024

Joint Top-Down and Bottom-Up Frameworks for 3D Visual Grounding

International Conference on Pattern Recognition (ICPR), 2024

Yang Liu

Daizong Liu

Wei Hu

3DPC

425

21 Oct 2024

ResVG: Enhancing Relation and Semantic Understanding in Multiple Instances for Visual Grounding

ACM Multimedia (MM), 2024

Minghang Zheng

Jiahua Zhang

Qingchao Chen

Yuxin Peng

Yang Liu

ObjD

337

29 Aug 2024

R2G: Reasoning to Ground in 3D Scenes

Pattern Recognition (Pattern Recogn.), 2024

Yixuan Li

Zan Wang

Wei Liang

365

24 Aug 2024

Tell Codec What Worth Compressing: Semantically Disentangled Image Coding for Machine with LMMs

Visual Communications and Image Processing (VCIP), 2024

Shengyang Zhao

Zhibo Chen

Xin Jin

432

16 Aug 2024

LLMI3D: MLLM-based 3D Perception from a Single 2D Image

Fan Yang

Sicheng Zhao

Yanhao Zhang

Haoxiang Chen

Hui Chen

Wenbo Tang

Guiguang Ding

290

14 Aug 2024

3D-GRES: Generalized 3D Referring Expression Segmentation

Jiayi Ji

316

30 Jul 2024

SegVG: Transferring Object Bounding Box to Segmentation for Visual Grounding

Weitai Kang

Gaowen Liu

Mubarak Shah

Yan Yan

ObjD

460

03 Jul 2024

LLM-Optic: Unveiling the Capabilities of Large Language Models for Universal Visual Grounding

376

27 May 2024

EarthGPT: A Universal Multi-modal Large Language Model for Multi-sensor Image Comprehension in Remote Sensing Domain

Tong Zhang

484

260

30 Jan 2024

GroundVLP: Harnessing Zero-shot Visual Grounding from Vision-Language Pre-training and Open-Vocabulary Object Detection

483

22 Dec 2023

Context Disentangling and Prototype Inheriting for Robust Visual Grounding

Wei Tang

302

19 Dec 2023

Mono3DVG: 3D Visual Grounding in Monocular Images

AAAI Conference on Artificial Intelligence (AAAI), 2023

Yangfan Zhan

Yuan Yuan

Zhitong Xiong

MDE

294

13 Dec 2023

Which One? Leveraging Context Between Objects and Multiple Views for Language Grounding

North American Chapter of the Association for Computational Linguistics (NAACL), 2023

274

12 Nov 2023

Language-guided Robot Grasping: CLIP-based Referring Grasp Synthesis in Clutter

Conference on Robot Learning (CoRL), 2023

298

09 Nov 2023

RIO: A Benchmark for Reasoning Intention-Oriented Objects in Open Environments

Neural Information Processing Systems (NeurIPS), 2023

Jingkuan Song

258

26 Oct 2023

NICE: Improving Panoptic Narrative Detection and Segmentation with Cascading Collaborative Learning

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023

Jiayi Ji

421

17 Oct 2023

Towards Complex-query Referring Image Segmentation: A Novel Benchmark

Wei Ji

Li Li

Roger Zimmermann

227

29 Sep 2023

Spatial and Visual Perspective-Taking via View Rotation and Relation Reasoning for Embodied Reference Understanding

European Conference on Computer Vision (ECCV), 2023

Cheng Shi

Sibei Yang

LRM

195

03 Sep 2023

Contrastive Grouping with Transformer for Referring Image Segmentation

Computer Vision and Pattern Recognition (CVPR), 2023

427

02 Sep 2023

3D-STMN: Dependency-Driven Superpoint-Text Matching Network for End-to-End 3D Referring Expression Segmentation

AAAI Conference on Artificial Intelligence (AAAI), 2023

Qi Chen

Jiayi Ji

299

31 Aug 2023

Described Object Detection: Liberating Object Detection with Flexible Expressions

Neural Information Processing Systems (NeurIPS), 2023

343

24 Jul 2023

Iterative Robust Visual Grounding with Masked Reference based Centerpoint Supervision

Xiangtai Li

299

23 Jul 2023

CLIP-VG: Self-paced Curriculum Adapting of CLIP for Visual Grounding

IEEE Transactions on Multimedia (IEEE TMM), 2023

Linhui Xiao

Xiaoshan Yang

Fang Peng

Ming Yan

Yaowei Wang

Changsheng Xu

ObjD VLM

543

15 May 2023

Vision-Language Models in Remote Sensing: Current Progress and Future Trends

IEEE Geoscience and Remote Sensing Magazine (GRSM), 2023

Xiao Xiang Zhu

419

187

09 May 2023

Open-vocabulary Object Segmentation with Diffusion Models

IEEE International Conference on Computer Vision (ICCV), 2023

376

12 Jan 2023

Look Around and Refer: 2D Synthetic Semantics Knowledge Distillation for 3D Visual Grounding

Neural Information Processing Systems (NeurIPS), 2022

228

25 Nov 2022

YORO -- Lightweight End to End Visual Grounding

260

15 Nov 2022

VLT: Vision-Language Transformer and Query Generation for Referring Segmentation

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022

370

167

28 Oct 2022

RSVG: Exploring Data and Models for Visual Grounding on Remote Sensing Data

IEEE Transactions on Geoscience and Remote Sensing (IEEE TGRS), 2022

Yangfan Zhan

Zhitong Xiong

Yuan Yuan

279

207

23 Oct 2022

Enhancing Interpretability and Interactivity in Robot Manipulation: A Neurosymbolic Approach

Georgios Tziafas

Hamidreza Kasaei

LM&Ro

462

03 Oct 2022

One for All: One-stage Referring Expression Comprehension with Dynamic Reasoning

Neurocomputing, 2022

332

31 Jul 2022

DoRO: Disambiguation of referred object for embodied agents

IEEE Robotics and Automation Letters (RA-L), 2022

215

28 Jul 2022

SiRi: A Simple Selective Retraining Mechanism for Transformer-based Visual Grounding

European Conference on Computer Vision (ECCV), 2022

Xiaodan Liang

135

27 Jul 2022

TransVG++: End-to-End Visual Grounding with Language Conditioned Vision Transformer

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022

Wanli Ouyang

291

14 Jun 2022

Sim-To-Real Transfer of Visual Grounding for Human-Aided Ambiguity Resolution

Georgios Tziafas

S. Kasaei

333

24 May 2022

Improving Visual Grounding with Visual-Linguistic Verification and Iterative Reasoning

Computer Vision and Pattern Recognition (CVPR), 2022

Li Yang

Yan Xu

Chunfen Yuan

Wei Liu

Bing Li

Weiming Hu

ObjD

354

165

30 Apr 2022

A Survivor in the Era of Large-Scale Pretraining: An Empirical Study of One-Stage Referring Expression Comprehension

IEEE Transactions on Multimedia (IEEE TMM), 2022

279

17 Apr 2022

3D-SPS: Single-Stage 3D Visual Grounding via Referred Point Progressive Selection

Computer Vision and Pattern Recognition (CVPR), 2022

330

134

13 Apr 2022

ReCLIP: A Strong Zero-Shot Baseline for Referring Expression Comprehension

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

327

169

12 Apr 2022