Learn from the Past: Language-conditioned Object Rearrangement with Large Language Models

Rearranging objects into a specified goal state is an important task for collaborative robots. Accurately determining object placement is a key challenge, since misplacement can increase task complexity and the risk of collisions, reducing the efficiency of the rearrangement process. Most current methods rely heavily on pre-collected datasets to train models that predict goal positions; as a result, they are restricted to the instructions seen during training, which limits their broader applicability and generalisation. In this paper, we propose a flexible language-conditioned object rearrangement framework based on large language models (LLMs). Our approach mimics human reasoning by using successful past experiences as references to infer the best strategy for reaching the current desired goal position. Owing to the LLM's strong natural language comprehension and inference abilities, our method generalises to a variety of everyday objects and free-form language instructions in a zero-shot manner. Experimental results demonstrate that our method can effectively execute robotic rearrangement tasks, even those involving long sequences of instructions.
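To make the "learn from the past" idea concrete, here is a minimal Python sketch of one plausible mechanism consistent with the abstract: retrieve successful past episodes similar to the current instruction and supply them as in-context examples when prompting an LLM for a goal placement. Everything here (the Episode record, word-overlap retrieval, the prompt format, and the abstract llm_complete backend) is an illustrative assumption, not the paper's actual implementation.

```python
from dataclasses import dataclass

@dataclass
class Episode:
    """One successful past rearrangement (hypothetical structure)."""
    instruction: str   # free-form language instruction that was executed
    scene: str         # textual scene description (object names and poses)
    placement: tuple   # (x, y) goal position that led to success

def retrieve_similar(memory: list, instruction: str, k: int = 3) -> list:
    """Rank past episodes by naive word overlap with the new instruction.
    A real system might use learned embeddings; word overlap keeps the
    sketch self-contained."""
    query = set(instruction.lower().split())
    scored = sorted(
        memory,
        key=lambda e: -len(query & set(e.instruction.lower().split())),
    )
    return scored[:k]

def build_prompt(examples: list, scene: str, instruction: str) -> str:
    """Few-shot prompt: past successes as references, then the current query."""
    parts = ["You place objects for a robot. Reply with a goal position (x, y)."]
    for e in examples:
        parts.append(
            f"Scene: {e.scene}\nInstruction: {e.instruction}\n"
            f"Goal position: {e.placement}"
        )
    parts.append(f"Scene: {scene}\nInstruction: {instruction}\nGoal position:")
    return "\n\n".join(parts)

if __name__ == "__main__":
    memory = [
        Episode("put the mug next to the plate",
                "plate at (0.4, 0.2), mug at (0.1, 0.5)", (0.5, 0.2)),
    ]
    examples = retrieve_similar(memory, "place the cup beside the plate")
    prompt = build_prompt(examples,
                          "plate at (0.3, 0.3), cup at (0.7, 0.6)",
                          "place the cup beside the plate")
    # The prompt would be sent to whichever LLM backend is available
    # (llm_complete(prompt) -> "(x, y)"); the paper does not pin down an API,
    # so that call is left abstract here.
    print(prompt)
```

Because the past episodes are injected purely through the prompt, no task-specific training is needed, which is what allows zero-shot generalisation to new objects and free-form instructions in this reading of the approach.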
@article{cao2025_2501.18516,
  title   = {Learn from the Past: Language-conditioned Object Rearrangement with Large Language Models},
  author  = {Guanqun Cao and Ryan Mckenna and Erich Graf and John Oyekan},
  journal = {arXiv preprint arXiv:2501.18516},
  year    = {2025}
}