Beyond Bare Queries: Open-Vocabulary Object Grounding with 3D Scene Graph

Beyond Bare Queries: Open-Vocabulary Object Grounding with 3D Scene Graph

11 June 2024

Svetlana Ladanova

Dmitry A. Yudin

Maxim Monastyrny

Aleksei Valenkov

Papers citing "Beyond Bare Queries: Open-Vocabulary Object Grounding with 3D Scene Graph"

11 / 11 papers shown

Title
DyGEnc: Encoding a Sequence of Textual Scene Graphs to Reason and Answer Questions in Dynamic Scenes S. Linok Vadim Semenov Anastasia Trunova Oleg Bulichev Dmitry A. Yudin 37 0 0 06 May 2025
TeDA: Boosting Vision-Lanuage Models for Zero-Shot 3D Object Retrieval via Testing-time Distribution Alignment Z. Wang Yang Zhou Jinhai Xiang Y. Wang Xinwei He VLM 37 0 0 05 May 2025
Scene-LLM: Extending Language Model for 3D Visual Understanding and Reasoning Rao Fu Jingyu Liu Xilun Chen Yixin Nie Wenhan Xiong LM&Ro LRM 39 47 0 18 Mar 2024
EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters Quan-Sen Sun Jinsheng Wang Qiying Yu Yufeng Cui Fan Zhang Xiaosong Zhang Xinlong Wang VLM CLIP MLLM 81 38 0 06 Feb 2024
Chat-3D v2: Bridging 3D Scene and Large Language Models with Object Identifiers Haifeng Huang Zehan Wang Rongjie Huang Luping Liu Xize Cheng Yang Zhao Tao Jin Zhou Zhao 50 40 0 13 Dec 2023
Incremental 3D Semantic Scene Graph Prediction from RGB Sequences Shun-cheng Wu Keisuke Tateno Nassir Navab F. Tombari 3DPC 3DV 35 20 0 04 May 2023
Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models Jiarui Xu Sifei Liu Arash Vahdat Wonmin Byeon Xiaolong Wang Shalini De Mello VLM 198 318 0 08 Mar 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models Junnan Li Dongxu Li Silvio Savarese Steven C. H. Hoi VLM MLLM 244 4,186 0 30 Jan 2023
Mask3D: Mask Transformer for 3D Semantic Instance Segmentation Jonas Schult Francis Engelmann Alexander Hermans Or Litany Siyu Tang Bastian Leibe ISeg 50 164 0 06 Oct 2022
Emerging Properties in Self-Supervised Vision Transformers Mathilde Caron Hugo Touvron Ishan Misra Hervé Jégou Julien Mairal Piotr Bojanowski Armand Joulin 283 5,723 0 29 Apr 2021
Open-vocabulary Object Detection via Vision and Language Knowledge Distillation Xiuye Gu Tsung-Yi Lin Weicheng Kuo Yin Cui VLM ObjD 220 698 0 28 Apr 2021