ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2505.05622
35
0

CityNavAgent: Aerial Vision-and-Language Navigation with Hierarchical Semantic Planning and Global Memory

8 May 2025
Weichen Zhang
Chen Gao
Shiquan Yu
Ruiying Peng
Baining Zhao
Qian Zhang
Jinqiang Cui
Xinlei Chen
Y. Li
    LLMAG
    LM&Ro
ArXivPDFHTML
Abstract

Aerial vision-and-language navigation (VLN), requiring drones to interpret natural language instructions and navigate complex urban environments, emerges as a critical embodied AI challenge that bridges human-robot interaction, 3D spatial reasoning, and real-world deployment. Although existing ground VLN agents achieved notable results in indoor and outdoor settings, they struggle in aerial VLN due to the absence of predefined navigation graphs and the exponentially expanding action space in long-horizon exploration. In this work, we propose \textbf{CityNavAgent}, a large language model (LLM)-empowered agent that significantly reduces the navigation complexity for urban aerial VLN. Specifically, we design a hierarchical semantic planning module (HSPM) that decomposes the long-horizon task into sub-goals with different semantic levels. The agent reaches the target progressively by achieving sub-goals with different capacities of the LLM. Additionally, a global memory module storing historical trajectories into a topological graph is developed to simplify navigation for visited targets. Extensive benchmark experiments show that our method achieves state-of-the-art performance with significant improvement. Further experiments demonstrate the effectiveness of different modules of CityNavAgent for aerial VLN in continuous city environments. The code is available at \href{this https URL}{link}.

View on arXiv
@article{zhang2025_2505.05622,
  title={ CityNavAgent: Aerial Vision-and-Language Navigation with Hierarchical Semantic Planning and Global Memory },
  author={ Weichen Zhang and Chen Gao and Shiquan Yu and Ruiying Peng and Baining Zhao and Qian Zhang and Jinqiang Cui and Xinlei Chen and Yong Li },
  journal={arXiv preprint arXiv:2505.05622},
  year={ 2025 }
}
Comments on this paper