ASHiTA: Automatic Scene-grounded HIerarchical Task Analysis

9 April 2025
Yun Chang
Leonor Fermoselle
Duy Ta
Bernadette Bucher
Luca Carlone
Jiuguang Wang
Abstract

While recent work in scene reconstruction and understanding has made strides in grounding natural language to physical 3D environments, it is still challenging to ground abstract, high-level instructions to a 3D scene. High-level instructions might not explicitly invoke semantic elements in the scene, and even the process of breaking a high-level task into a set of more concrete subtasks, a process called hierarchical task analysis, is environment-dependent. In this work, we propose ASHiTA, the first framework that generates a task hierarchy grounded to a 3D scene graph by breaking down high-level tasks into grounded subtasks. ASHiTA alternates LLM-assisted hierarchical task analysis, to generate the task breakdown, with task-driven 3D scene graph construction to generate a suitable representation of the environment. Our experiments show that ASHiTA performs significantly better than LLM baselines in breaking down high-level tasks into environment-dependent subtasks and is additionally able to achieve grounding performance comparable to state-of-the-art methods.
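
As a rough illustration of what "a task hierarchy grounded to a 3D scene graph" could look like as a data structure, here is a minimal Python sketch. It is not ASHiTA's implementation: the `SceneNode` and `Task` classes, the `grounded_leaves` helper, and the kitchen example are all hypothetical, and the groundings are written by hand rather than produced by an LLM or a scene graph builder.

```python
from dataclasses import dataclass, field


@dataclass
class SceneNode:
    """A node in a toy 3D scene graph (hypothetical schema, not from the paper)."""
    node_id: str
    label: str


@dataclass
class Task:
    """A task in the hierarchy; tasks without children are concrete subtasks."""
    description: str
    children: list["Task"] = field(default_factory=list)
    grounded_to: list[str] = field(default_factory=list)  # scene node ids


def grounded_leaves(task: Task) -> list[tuple[str, list[str]]]:
    """Return (description, node ids) for every leaf subtask, depth-first."""
    if not task.children:
        return [(task.description, task.grounded_to)]
    leaves = []
    for child in task.children:
        leaves.extend(grounded_leaves(child))
    return leaves


# Hand-written toy scene and hierarchy (assumed data, for illustration only).
scene = {
    "n1": SceneNode("n1", "sink"),
    "n2": SceneNode("n2", "mug"),
    "n3": SceneNode("n3", "cabinet"),
}

root = Task(
    "tidy the kitchen",
    children=[
        Task("wash the mug", grounded_to=["n2", "n1"]),
        Task("put the mug away", grounded_to=["n2", "n3"]),
    ],
)

for description, node_ids in grounded_leaves(root):
    labels = [scene[i].label for i in node_ids]
    print(f"{description}: {labels}")
```

The high-level task "tidy the kitchen" names no object in the scene; only its leaf subtasks carry scene node ids. That is the sense in which a breakdown is environment-dependent: a different scene would call for different subtasks and groundings.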

@article{chang2025_2504.06553,
  title={ASHiTA: Automatic Scene-grounded HIerarchical Task Analysis},
  author={Yun Chang and Leonor Fermoselle and Duy Ta and Bernadette Bucher and Luca Carlone and Jiuguang Wang},
  journal={arXiv preprint arXiv:2504.06553},
  year={2025}
}