LLM-Based Agentic Exploration for Robot Navigation & Manipulation with Skill Orchestration

2 January 2026

Abu Hanif Muhammad Syarubany

Farhan Zaki Rahmani

Trio Widianto


Main: 8 pages

11 figures

Bibliography: 1 page

1 table

Abstract

This paper presents an end-to-end LLM-based agentic exploration system for an indoor shopping task, evaluated in both Gazebo simulation and a corresponding real-world corridor layout. The robot incrementally builds a lightweight semantic map by detecting signboards at junctions and storing direction-to-POI relations together with estimated junction poses, while AprilTags provide repeatable anchors for approach and alignment. Given a natural-language shopping request, an LLM produces a constrained discrete action at each junction (direction and whether to enter a store), and a ROS finite-state main controller executes the decision by gating modular motion primitives, including local-costmap-based obstacle avoidance, AprilTag approaching, store entry, and grasping. Qualitative results show that the integrated stack can perform end-to-end task execution from user instruction to multi-store navigation and object retrieval, while remaining modular and debuggable through its text-based map and logged decision history.
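
The abstract does not give the map format or the action vocabulary, so the following is only a minimal sketch of how a text-based junction map and a constrained per-junction decision could be represented. The names `Junction`, `map_to_text`, `parse_action`, the direction set, and the `"direction, enter|skip"` reply format are all assumptions made for illustration, not the authors' implementation.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Assumed discrete direction vocabulary; the paper's actual set may differ.
DIRECTIONS = ("left", "right", "forward", "back")


@dataclass
class Junction:
    """One junction in a hypothetical lightweight semantic map."""
    name: str
    pose: tuple[float, float, float]  # estimated (x, y, yaw) in the map frame
    signs: dict[str, list[str]] = field(default_factory=dict)  # direction -> POIs


def map_to_text(junctions: list[Junction]) -> str:
    """Serialize the map as plain text, e.g. for inclusion in an LLM prompt."""
    lines = []
    for j in junctions:
        x, y, yaw = j.pose
        lines.append(f"{j.name} at ({x:.1f}, {y:.1f}, {yaw:.2f})")
        for direction, pois in sorted(j.signs.items()):
            lines.append(f"  {direction}: {', '.join(pois)}")
    return "\n".join(lines)


def parse_action(reply: str) -> tuple[str, bool]:
    """Validate a reply of the assumed form 'direction, enter|skip'.

    Returns (direction, enter_store). Anything outside the allowed
    vocabulary is rejected, so the controller never sees free-form text.
    """
    parts = [p.strip().lower() for p in reply.split(",")]
    if len(parts) != 2 or parts[0] not in DIRECTIONS or parts[1] not in ("enter", "skip"):
        raise ValueError(f"reply outside the constrained action set: {reply!r}")
    return parts[0], parts[1] == "enter"
```

In a setup like this, the serialized map and the shopping request would go into the prompt, and only a reply that passes `parse_action` would be handed to the state machine that gates the motion primitives.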
