Semantic Abstraction: Open-World 3D Scene Understanding from 2D Vision-Language Models

23 July 2022
Huy Ha
Shuran Song
    LM&Ro
    VLM

Papers citing "Semantic Abstraction: Open-World 3D Scene Understanding from 2D Vision-Language Models"

50 / 85 papers shown
AS3D: 2D-Assisted Cross-Modal Understanding with Semantic-Spatial Scene Graphs for 3D Visual Grounding
Feng Xiao
Hongbin Xu
Guocan Zhao
Wenxiong Kang
37
0
0
07 May 2025
WIR3D: Visually-Informed and Geometry-Aware 3D Shape Abstraction
Richard Liu
Daniel Fu
Noah Tan
Itai Lang
Rana Hanocka
3DH
43
0
0
07 May 2025
GrabS: Generative Embodied Agent for 3D Object Segmentation without Scene Supervision
Zihui Zhang
Yafei Yang
Hongtao Wen
Bo Yang
3DPC
30
0
0
16 Apr 2025
UKBOB: One Billion MRI Labeled Masks for Generalizable 3D Medical Image Segmentation
Emmanuelle Bourigault
A. Jamaludin
Abdullah Hamdi
21
0
0
09 Apr 2025
Unveiling the Mist over 3D Vision-Language Understanding: Object-centric Evaluation with Chain-of-Analysis
J. Huang
Baoxiong Jia
Y. Wang
Ziyu Zhu
Xiongkun Linghu
Qing Li
Song-Chun Zhu
Siyuan Huang
75
3
0
28 Mar 2025
Cross-Modal and Uncertainty-Aware Agglomeration for Open-Vocabulary 3D Scene Understanding
Jinlong Li
Cristiano Saltori
Fabio Poiesi
N. Sebe
69
0
0
20 Mar 2025
OSMa-Bench: Evaluating Open Semantic Mapping Under Varying Lighting Conditions
Maxim Popov
Regina Kurkova
Mikhail Iumanov
Jaafar Mahmoud
Sergey Kolyubin
34
0
0
13 Mar 2025
Predicate Hierarchies Improve Few-Shot State Classification
Emily Jin
Joy Hsu
Jiajun Wu
OffRL
67
0
0
18 Feb 2025
GPT4Scene: Understand 3D Scenes from Videos with Vision-Language Models
Zhangyang Qi
Zhixiong Zhang
Ye Fang
Jiaqi Wang
Hengshuang Zhao
83
6
0
02 Jan 2025
Data-Efficient Inference of Neural Fluid Fields via SciML Foundation Model
Yuqiu Liu
Jingxuan Xu
Mauricio Soroco
Yunchao Wei
Wuyang Chen
AI4CE
65
2
0
18 Dec 2024
RelationField: Relate Anything in Radiance Fields
Sebastian Koch
Johanna Wald
Mirco Colosi
Narunas Vaskevicius
Pedro Hermosilla
F. Tombari
Timo Ropinski
102
1
0
18 Dec 2024
Occam's LGS: An Efficient Approach for Language Gaussian Splatting
Jiahuan Cheng
Jan-Nico Zaech
Luc Van Gool
Danda Pani Paudel
3DGS
73
0
0
02 Dec 2024
ROOT: VLM based System for Indoor Scene Understanding and Beyond
Yonghui Wang
Shi-Yong Chen
Zhenxing Zhou
Siyi Li
Haoran Li
Wengang Zhou
H. Li
VLM
64
3
0
24 Nov 2024
XMask3D: Cross-modal Mask Reasoning for Open Vocabulary 3D Semantic Segmentation
Ziyi Wang
Y. Wang
Xumin Yu
Jie Zhou
Jiwen Lu
62
0
0
20 Nov 2024
Multimodality Helps Few-shot 3D Point Cloud Semantic Segmentation
Zhaochong An
Guolei Sun
Yun Liu
Runjia Li
Min Wu
Ming-Ming Cheng
Ender Konukoglu
Serge J. Belongie
52
4
0
29 Oct 2024
Structured Spatial Reasoning with Open Vocabulary Object Detectors
Negar Nejatishahidin
Madhukar Reddy Vongala
Jana Kosecka
22
2
0
09 Oct 2024
In-Place Panoptic Radiance Field Segmentation with Perceptual Prior for 3D Scene Understanding
Shenghao Li
22
1
0
06 Oct 2024
Robot Navigation Using Physically Grounded Vision-Language Models in Outdoor Environments
Mohamed Bashir Elnoor
K. Weerakoon
Gershom Seneviratne
Ruiqi Xian
Tianrui Guan
Mohamed Khalid M Jaffar
Vignesh Rajagopal
Dinesh Manocha
18
2
0
30 Sep 2024
BehAV: Behavioral Rule Guided Autonomy Using VLMs for Robot Navigation in Outdoor Scenes
K. Weerakoon
Mohamed Bashir Elnoor
Gershom Seneviratne
Vignesh Rajagopal
Senthil Hariharan Arul
Jing Liang
Mohamed Khalid M Jaffar
Dinesh Manocha
LM&Ro
35
7
0
24 Sep 2024
From Words to Poses: Enhancing Novel Object Pose Estimation with Vision Language Models
Tessa Pulli
Stefan Thalhammer
Simon Schwaiger
Markus Vincze
LM&Ro
35
0
0
09 Sep 2024
Zero-Shot Dual-Path Integration Framework for Open-Vocabulary 3D Instance Segmentation
Tri Ton
Ji Woo Hong
Soohwan Eom
Jun Yeop Shim
Junyeong Kim
Chang D. Yoo
3DPC
ISeg
30
2
0
16 Aug 2024
VL-TGS: Trajectory Generation and Selection using Vision Language Models in Mapless Outdoor Environments
Daeun Song
Jing Liang
Xuesu Xiao
Dinesh Manocha
44
4
0
05 Aug 2024
Improving 2D Feature Representations by 3D-Aware Fine-Tuning
Yuanwen Yue
Anurag Das
Francis Engelmann
Siyu Tang
J. E. Lenssen
38
23
0
29 Jul 2024
DKPROMPT: Domain Knowledge Prompting Vision-Language Models for Open-World Planning
Xiaohan Zhang
Zainab Altaweel
Yohei Hayamizu
Yan Ding
S. Amiri
Hao Yang
Andy Kaminski
Chad Esselink
Shiqi Zhang
VLM
LM&Ro
31
6
0
25 Jun 2024
Beyond Bare Queries: Open-Vocabulary Object Grounding with 3D Scene Graph
S. Linok
T. Zemskova
Svetlana Ladanova
Roman Titkov
Dmitry A. Yudin
Maxim Monastyrny
Aleksei Valenkov
LM&Ro
43
0
0
11 Jun 2024
InsightSee: Advancing Multi-agent Vision-Language Models for Enhanced Visual Understanding
Huaxiang Zhang
Yaojia Mu
Guo-Niu Zhu
Zhongxue Gan
34
2
0
31 May 2024
Unifying 3D Vision-Language Understanding via Promptable Queries
Ziyu Zhu
Zhuofan Zhang
Xiaojian Ma
Xuesong Niu
Yixin Chen
Baoxiong Jia
Zhidong Deng
Siyuan Huang
Qing Li
40
21
0
19 May 2024
When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models
Xianzheng Ma
Yash Bhalgat
Brandon Smart
Shuai Chen
Xinghui Li
...
Matthias Nießner
Ian D Reid
Angel X. Chang
Iro Laina
V. Prisacariu
LRM
29
11
0
16 May 2024
Clio: Real-time Task-Driven Open-Set 3D Scene Graphs
Dominic Maggio
Yun Chang
Nathan Hughes
Matthew Trang
Dan Griffith
Carlyn Dougherty
Eric Cristofalo
Lukas Schmid
Luca Carlone
3DV
33
31
0
21 Apr 2024
Unified Scene Representation and Reconstruction for 3D Large Language Models
Tao Chu
Pan Zhang
Xiao-wen Dong
Yuhang Zang
Qiong Liu
Jiaqi Wang
18
1
0
19 Apr 2024
OpenNeRF: Open Set 3D Neural Scene Segmentation with Pixel-Wise Features and Rendered Novel Views
Francis Engelmann
Fabian Manhardt
Michael Niemeyer
Keisuke Tateno
Marc Pollefeys
Federico Tombari
VLM
59
32
1
04 Apr 2024
Segment Any 3D Object with Language
Seungjun Lee
Yuyang Zhao
Gim Hee Lee
31
1
0
02 Apr 2024
Explore until Confident: Efficient Exploration for Embodied Question Answering
Allen Z. Ren
Jaden Clark
Anushri Dixit
Masha Itkina
Anirudha Majumdar
Dorsa Sadigh
40
28
0
23 Mar 2024
CoNVOI: Context-aware Navigation using Vision Language Models in Outdoor and Indoor Environments
A. Sathyamoorthy
K. Weerakoon
Mohamed Bashir Elnoor
Anuj Zore
Brian Ichter
Fei Xia
Jie Tan
Wenhao Yu
Dinesh Manocha
LM&Ro
43
17
0
22 Mar 2024
Semantic Gaussians: Open-Vocabulary Scene Understanding with 3D Gaussian Splatting
Jun Guo
Xiaojian Ma
Yue Fan
Huaping Liu
Qing Li
3DGS
36
26
0
22 Mar 2024
N2F2: Hierarchical Scene Understanding with Nested Neural Feature Fields
Yash Bhalgat
Iro Laina
João F. Henriques
Andrew Zisserman
Andrea Vedaldi
41
14
0
16 Mar 2024
TAMM: TriAdapter Multi-Modal Learning for 3D Shape Understanding
Zhihao Zhang
Shengcao Cao
Yu-Xiong Wang
24
14
0
28 Feb 2024
Open3DSG: Open-Vocabulary 3D Scene Graphs from Point Clouds with Queryable Objects and Open-Set Relationships
Sebastian Koch
Narunas Vaskevicius
Mirco Colosi
Pedro Hermosilla
Timo Ropinski
3DPC
25
25
0
19 Feb 2024
OK-Robot: What Really Matters in Integrating Open-Knowledge Models for Robotics
Peiqi Liu
Yaswanth Orru
Jay Vakil
Chris Paxton
Nur Muhammad (Mahi) Shafiullah
Lerrel Pinto
LM&Ro
VLM
86
27
0
22 Jan 2024
ODIN: A Single Model for 2D and 3D Segmentation
Ayush Jain
Pushkal Katara
N. Gkanatsios
Adam W. Harley
Gabriel H. Sarch
Kriti Aggarwal
Vishrav Chaudhary
Katerina Fragkiadaki
3DPC
32
7
0
04 Jan 2024
3D Open-Vocabulary Panoptic Segmentation with 2D-3D Vision-Language Distillation
Zihao Xiao
Longlong Jing
Shangxuan Wu
Alex Zihao Zhu
Jingwei Ji
...
Thomas Funkhouser
Weicheng Kuo
A. Angelova
Yin Zhou
Shiwei Sheng
VLM
23
5
0
04 Jan 2024
Segment3D: Learning Fine-Grained Class-Agnostic 3D Segmentation without Manual Labels
Rui Huang
Songyou Peng
Ayca Takmaz
Federico Tombari
Marc Pollefeys
Shiji Song
Gao Huang
Francis Engelmann
VLM
8
37
0
28 Dec 2023
Foundation Models in Robotics: Applications, Challenges, and the Future
Roya Firoozi
Johnathan Tucker
Stephen Tian
Anirudha Majumdar
Jiankai Sun
...
Brian Ichter
Danny Driess
Jiajun Wu
Cewu Lu
Mac Schwager
LM&Ro
AI4CE
LRM
VLM
33
136
0
13 Dec 2023
Dream2Real: Zero-Shot 3D Object Rearrangement with Vision-Language Models
Ivan Kapelyukh
Yifei Ren
Ignacio Alzugaray
Edward Johns
VLM
LM&Ro
15
20
0
07 Dec 2023
Segment Any 3D Gaussians
Jiazhong Cen
Jiemin Fang
Chen Yang
Lingxi Xie
Xiaopeng Zhang
Wei Shen
Qi Tian
3DGS
51
66
0
01 Dec 2023
3D Paintbrush: Local Stylization of 3D Shapes with Cascaded Score Distillation
Dale Decatur
Itai Lang
Kfir Aberman
Rana Hanocka
22
16
0
16 Nov 2023
VioLA: Aligning Videos to 2D LiDAR Scans
Jun-Jee Chao
Selim Engin
Nikhil Chavan-Dafle
Bhoram Lee
Volkan Isler
VGen
14
0
0
08 Nov 2023
Large Language Models as Generalizable Policies for Embodied Tasks
Andrew Szot
Max Schwarzer
Harsh Agrawal
Bogdan Mazoure
Walter A. Talbott
Katherine Metcalf
Natalie Mackraz
Devon Hjelm
Alexander Toshev
LM&Ro
8
57
0
26 Oct 2023
ViT-A*: Legged Robot Path Planning using Vision Transformer A*
Jianwei Liu
Shirui Lyu
Denis Hadjivelichkov
Valerio Modugno
Dimitrios Kanoulas
21
8
0
11 Oct 2023
Compositional Semantics for Open Vocabulary Spatio-semantic Representations
Robin Karlsson
Francisco Lepe-Salazar
K. Takeda
VLM
40
1
0
08 Oct 2023