443
v1v2v3 (latest)

LLM-Craft: Robotic Crafting of Elasto-Plastic Objects with Large Language Models

Main:6 Pages
9 Figures
Bibliography:1 Pages
Abstract

When humans create sculptures, we are able to reason about how geometrically we need to alter the clay state to reach our target goal. We are not computing point-wise similarity metrics, or reasoning about low-level positioning of our tools, but instead determining the higher-level changes that need to be made. In this work, we propose LLM-Craft, a novel pipeline that leverages large language models (LLMs) to iteratively reason about and generate deformation-based crafting action sequences. We simplify and couple the state and action representations to further encourage shape-based reasoning. To the best of our knowledge, LLM-Craft is the first system successfully leveraging LLMs for complex deformable object interactions. Through our experiments, we demonstrate that with the LLM-Craft framework, LLMs are able to successfully create a set of simple letter shapes. We explore a variety of rollout strategies, and compare performances of LLM-Craft variants with and without an explicit goal shape images. For videos and prompting details, please visit our project website:this https URL

View on arXiv
Comments on this paper