Air-Ground Collaboration for Language-Specified Missions in Unknown Environments

As autonomous robotic systems mature, users will want to specify missions at the level of intent rather than in low-level detail. Language is an expressive and intuitive medium for such mission specification. However, realizing language-guided robotic teams requires overcoming significant technical hurdles. Interpreting language-specified missions demands advanced semantic reasoning. Heterogeneous robots must effectively coordinate their actions and share information across differing viewpoints. Moreover, inter-robot communication is typically intermittent, necessitating robust strategies that exploit communication opportunities to maintain coordination and achieve mission objectives. In this work, we present a first-of-its-kind system in which an unmanned aerial vehicle (UAV) and an unmanned ground vehicle (UGV) collaboratively accomplish missions specified in natural language while reacting to changes in the specification on the fly. We leverage a Large Language Model (LLM)-enabled planner that reasons over semantic-metric maps built online and opportunistically shared between the aerial and ground robots. We consider task-driven navigation in urban and rural areas, where the system must infer mission-relevant semantics and actively acquire information via semantic mapping. In both ground-only and air-ground teaming experiments, we demonstrate our system on seven different natural-language specifications, with navigation at up to kilometer scale.
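To make the described pipeline concrete, below is a minimal, hypothetical sketch (not the authors' implementation) of the loop the abstract outlines: an LLM-enabled planner grounds the natural-language mission in a semantic-metric map, and the UGV opportunistically merges the UAV's map whenever the intermittent link is up. All names here (SemanticMap, query_llm_planner, mission_step) are illustrative assumptions introduced for this sketch.

```python
# Hypothetical sketch of language-grounded planning over an opportunistically
# shared semantic-metric map. Names and structure are assumptions, not the
# paper's API.

from dataclasses import dataclass, field


@dataclass
class SemanticMap:
    # landmark label -> metric position (x, y) in a shared reference frame
    landmarks: dict[str, tuple[float, float]] = field(default_factory=dict)

    def merge(self, other: "SemanticMap") -> None:
        # Opportunistic merge: keep the union of both robots' landmarks.
        self.landmarks.update(other.landmarks)


def query_llm_planner(mission: str, semantic_map: SemanticMap) -> str | None:
    # Stand-in for an LLM call that grounds the natural-language mission in
    # the current map and returns the next landmark to navigate to.
    for label in semantic_map.landmarks:
        if label in mission:
            return label
    return None  # mission-relevant semantics not yet observed


def mission_step(mission: str, ugv_map: SemanticMap,
                 uav_map: SemanticMap, comms_available: bool) -> str:
    if comms_available:  # intermittent link: merge maps only when in contact
        ugv_map.merge(uav_map)
    goal = query_llm_planner(mission, ugv_map)
    return goal or "explore"  # fall back to active information acquisition


# Usage: the UAV has already mapped a landmark the UGV cannot see yet.
ugv_map = SemanticMap({"gas station": (12.0, 4.0)})
uav_map = SemanticMap({"bridge": (410.0, 95.0)})
print(mission_step("go to the bridge", ugv_map, uav_map, comms_available=True))
# -> bridge
```

The fallback to "explore" mirrors the abstract's point about actively acquiring information: when the mission references semantics not yet in the map, the planner directs the robot to keep mapping rather than fail outright.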
@article{cladera2025_2505.09108,
  title={Air-Ground Collaboration for Language-Specified Missions in Unknown Environments},
  author={Fernando Cladera and Zachary Ravichandran and Jason Hughes and Varun Murali and Carlos Nieto-Granda and M. Ani Hsieh and George J. Pappas and Camillo J. Taylor and Vijay Kumar},
  journal={arXiv preprint arXiv:2505.09108},
  year={2025}
}