ViewRefer: Grasp the Multi-view Knowledge for 3D Visual Grounding with
GPT and Prototype Guidance

ViewRefer: Grasp the Multi-view Knowledge for 3D Visual Grounding with GPT and Prototype Guidance

29 March 2023

Xuelong Li

Papers citing "ViewRefer: Grasp the Multi-view Knowledge for 3D Visual Grounding with GPT and Prototype Guidance"

6 / 56 papers shown

Title
Masked Autoencoders Are Scalable Vision Learners Kaiming He Xinlei Chen Saining Xie Yanghao Li Piotr Dollár Ross B. Girshick ViT TPM 258 7,337 0 11 Nov 2021
Learning to Prompt for Vision-Language Models Kaiyang Zhou Jingkang Yang Chen Change Loy Ziwei Liu VPVLM CLIP VLM 322 2,108 0 02 Sep 2021
Patch-NetVLAD: Multi-Scale Fusion of Locally-Global Descriptors for Place Recognition Stephen Hausler Sourav Garg Ming Xu Michael Milford Tobias Fischer 39 328 0 02 Mar 2021
InstanceRefer: Cooperative Holistic Understanding for Visual Grounding on Point Clouds through Instance Multi-level Contextual Referring Zhihao Yuan Xu Yan Yinghong Liao Ruimao Zhang Sheng Wang Zhen Li Shuguang Cui 59 128 0 01 Mar 2021
Self-Supervised Pretraining of 3D Features on any Point-Cloud Zaiwei Zhang Rohit Girdhar Armand Joulin Ishan Misra 3DPC 120 235 0 07 Jan 2021
PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation C. Qi Hao Su Kaichun Mo Leonidas J. Guibas 3DH 3DPC 3DV PINN 210 13,886 0 02 Dec 2016