Bidirectional Task-Motion Planning Based on Hierarchical Reinforcement Learning for Strategic Confrontation

In swarm robotics, confrontation scenarios, including strategic confrontation, require efficient decision-making that integrates discrete commands with continuous actions. Traditional task and motion planning methods split decision-making into two layers, but their unidirectional structure fails to capture the interdependence between those layers, which limits adaptability in dynamic environments. Here, we propose a novel bidirectional approach based on hierarchical reinforcement learning that enables dynamic interaction between the layers. The method maps commands to task allocation and actions to path planning, and uses cross-training techniques to enhance learning across the hierarchical framework. Furthermore, we introduce a trajectory prediction model that bridges abstract task representations and actionable planning goals. In our experiments, the method achieves a confrontation win rate above 80% and a decision time under 0.01 seconds, outperforming existing approaches. Large-scale tests and real-world robot experiments further demonstrate the generalization capability and practical applicability of the method.
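The two-layer interaction described in the abstract can be illustrated with a toy loop. This is only a minimal sketch under stated assumptions, not the authors' method: greedy allocation and straight-line grid stepping stand in for the learned upper-layer and lower-layer policies, and all names (`allocate`, `step_toward`, `run`) are hypothetical. At each step the motion layer reports cost estimates upward, and the task layer sends target assignments downward.

```python
from typing import List, Tuple

Point = Tuple[int, int]


def manhattan(a: Point, b: Point) -> int:
    """Grid distance, used here as the motion layer's cost estimate."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def allocate(costs: List[List[int]]) -> List[int]:
    """Upper layer (assumed greedy, not learned): agents pick, in order,
    the cheapest target that is still free. Needs targets >= agents."""
    free = set(range(len(costs[0])))
    assignment = []
    for row in costs:
        target = min(sorted(free), key=lambda j: row[j])
        assignment.append(target)
        free.remove(target)
    return assignment


def step_toward(pos: Point, goal: Point) -> Point:
    """Lower layer (assumed straight-line stepping): move one cell toward
    the goal, x first, then y. Returns pos unchanged once at the goal."""
    x, y = pos
    gx, gy = goal
    if x != gx:
        x += 1 if gx > x else -1
    elif y != gy:
        y += 1 if gy > y else -1
    return (x, y)


def run(agents: List[Point], targets: List[Point], max_steps: int = 50):
    """Alternate between the layers until every agent sits on its target."""
    agents = list(agents)
    assignment: List[int] = []
    for _ in range(max_steps):
        # Upward message: the motion layer's current cost estimates.
        costs = [[manhattan(a, t) for t in targets] for a in agents]
        # Downward message: the task layer's (re)allocation.
        assignment = allocate(costs)
        if all(a == targets[assignment[i]] for i, a in enumerate(agents)):
            break
        agents = [step_toward(a, targets[assignment[i]])
                  for i, a in enumerate(agents)]
    return agents, assignment


if __name__ == "__main__":
    final_positions, final_assignment = run([(0, 0), (5, 5)], [(4, 4), (1, 0)])
    print(final_positions, final_assignment)
```

Because the allocation is recomputed from fresh cost estimates at every step, a change in the motion layer's situation can alter the task layer's commands, which is the interdependence a purely top-down pipeline would miss.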
@article{wu2025_2504.15876,
  title={Bidirectional Task-Motion Planning Based on Hierarchical Reinforcement Learning for Strategic Confrontation},
  author={Qizhen Wu and Lei Chen and Kexin Liu and Jinhu Lü},
  journal={arXiv preprint arXiv:2504.15876},
  year={2025}
}