
UAV-VLRR: Vision-Language Informed NMPC for Rapid Response in UAV Search and Rescue

Main: 5 pages, Bibliography: 1 page, 7 figures, 4 tables
Abstract

Emergency search and rescue (SAR) operations often require rapid and precise target identification in complex environments where traditional manual drone control is inefficient. To address these scenarios, this research develops a rapid SAR system, UAV-VLRR (Vision-Language-Rapid-Response). The system consists of two parts: 1) a multimodal pipeline that harnesses a Vision-Language Model (VLM) and the natural language processing capabilities of ChatGPT-4o (LLM) for scene interpretation, and 2) a non-linear model predictive controller (NMPC) with built-in obstacle avoidance that enables the drone to respond rapidly by flying according to the output of the multimodal system. This work aims to improve response times in emergency SAR operations by giving the operator a more intuitive and natural way to plan the SAR mission while allowing the drone to carry out that mission rapidly and safely. When tested, our approach was faster on average by 33.75% compared with an off-the-shelf autopilot and by 54.6% compared with a human pilot. Video of UAV-VLRR: this https URL
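
As a point of reference for the control component, the display below sketches a generic finite-horizon NMPC with distance-based obstacle-avoidance constraints. The specific cost terms, dynamics model f, horizon N, weighting matrices, and safety margin are assumptions for illustration; the abstract does not specify the exact formulation used by UAV-VLRR.

\[
\begin{aligned}
\min_{u_0,\dots,u_{N-1}} \quad & \sum_{k=0}^{N-1} \Big( \lVert x_k - x_k^{\mathrm{ref}} \rVert_Q^2 + \lVert u_k \rVert_R^2 \Big) + \lVert x_N - x_N^{\mathrm{ref}} \rVert_P^2 \\
\text{s.t.} \quad & x_{k+1} = f(x_k, u_k), \qquad k = 0,\dots,N-1, \\
& \lVert p_k - p_j^{\mathrm{obs}} \rVert_2 \ge r_j + r_{\mathrm{safe}}, \qquad \forall j,\; k = 0,\dots,N, \\
& u_k \in \mathcal{U}, \quad x_k \in \mathcal{X}, \qquad k = 0,\dots,N,
\end{aligned}
\]

where \(x_k\) is the drone state, \(u_k\) the control input, \(p_k\) the position components of \(x_k\), \(x_k^{\mathrm{ref}}\) the reference trajectory derived from the multimodal system's output, \(p_j^{\mathrm{obs}}\) and \(r_j\) the center and radius of the \(j\)-th obstacle, \(r_{\mathrm{safe}}\) a safety margin, and \(Q\), \(R\), \(P\) positive semi-definite weighting matrices.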
