
Enhancing Safety of Foundation Models for Visual Navigation through Collision Avoidance via Repulsive Estimation

Main: 8 pages · Appendix: 5 pages · Bibliography: 3 pages · 8 figures · 6 tables
Abstract

We propose CARE (Collision Avoidance via Repulsive Estimation), a plug-and-play module that enhances the safety of vision-based navigation without requiring additional range sensors or fine-tuning of pretrained models. While recent foundation models using only RGB inputs have shown strong performance, they often fail to generalize in out-of-distribution (OOD) environments with unseen objects or variations in camera parameters (e.g., field of view, pose, or focal length). Without fine-tuning, these models may generate unsafe trajectories that lead to collisions, and retraining them requires costly data collection. CARE addresses this limitation by seamlessly integrating with any RGB-based navigation system that outputs local trajectories, dynamically adjusting them using repulsive force vectors derived from monocular depth maps. We evaluate CARE by combining it with state-of-the-art vision-based navigation models across multiple robot platforms. CARE consistently reduces collision rates (by up to 100%) without sacrificing goal-reaching performance and improves collision-free travel distance by up to 10.7x in exploration tasks.
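To make the core idea concrete, the following is a minimal Python sketch of the general mechanism the abstract describes: back-projecting a monocular depth estimate into 3D points and nudging a local trajectory away from nearby obstacles with a potential-field-style repulsive vector. All names, frame conventions, and parameters here (e.g., repulsive_adjustment, influence_radius, gain) are illustrative assumptions, not the paper's actual implementation.

import numpy as np

def backproject(depth, fx, fy, cx, cy):
    """Back-project a depth map (H, W) into camera-frame 3D points (N, 3).

    The caller is assumed to transform these points into the robot frame
    and project them onto the ground plane before computing repulsion.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)

def repulsive_adjustment(waypoints, obstacles, influence_radius=1.0, gain=0.5):
    """Shift each 2D waypoint away from obstacle points within a radius.

    waypoints: (T, 2) local plan in the robot frame (x forward, y left).
    obstacles: (N, 2) obstacle points projected onto the ground plane.
    """
    adjusted = waypoints.astype(float).copy()
    for i, wp in enumerate(waypoints):
        diff = wp - obstacles                      # vectors from obstacles to waypoint
        dist = np.linalg.norm(diff, axis=1)
        near = dist < influence_radius
        if not near.any():
            continue
        # Potential-field-style weighting: closer obstacles push harder.
        weights = (1.0 / dist[near] - 1.0 / influence_radius) / dist[near] ** 2
        force = (weights[:, None] * (diff[near] / dist[near][:, None])).sum(axis=0)
        adjusted[i] = wp + gain * force
    return adjusted

In practice such an adjustment would run on each local trajectory emitted by the navigation model before it is sent to the controller; the radius and gain trade off how aggressively the plan is deformed against how closely it follows the original model output.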

@article{kim2025_2506.03834,
  title={Enhancing Safety of Foundation Models for Visual Navigation through Collision Avoidance via Repulsive Estimation},
  author={Joonkyung Kim and Joonyeol Sim and Woojun Kim and Katia Sycara and Changjoo Nam},
  journal={arXiv preprint arXiv:2506.03834},
  year={2025}
}