
Efficient End-to-end Visual Localization for Autonomous Driving with Decoupled BEV Neural Matching

Abstract

Accurate localization plays an important role in high-level autonomous driving systems. Conventional map-matching-based localization methods solve for the pose by explicitly matching map elements with sensor observations; they are generally sensitive to perception noise and therefore require costly hyper-parameter tuning. In this paper, we propose an end-to-end localization neural network that directly estimates vehicle poses from surrounding images, without explicitly matching perception results with HD maps. To ensure efficiency and interpretability, we propose a decoupled BEV neural-matching-based pose solver, which estimates poses in a differentiable sampling-based matching module. Moreover, the sampling space is greatly reduced by decoupling the feature representations affected by each DoF of the pose. Experimental results demonstrate that the proposed network achieves decimeter-level localization, with mean absolute errors of 0.19 m, 0.13 m, and 0.39 degrees in longitudinal position, lateral position, and yaw angle, respectively, while exhibiting a 68.8% reduction in inference memory usage.
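
The reduction in sampling space can be illustrated with a small sketch. It is not the paper's implementation: the function names, the candidate counts, and the use of a softmax-weighted expectation (soft-argmax) as the differentiable selection step are all assumptions made for illustration. It compares the number of pose hypotheses needed when the three DoFs (longitudinal, lateral, yaw) are sampled jointly with the number needed when each DoF is sampled on its own.

```python
import numpy as np

def joint_hypothesis_count(n_lon, n_lat, n_yaw):
    """Hypotheses needed when every (lon, lat, yaw) combination is scored."""
    return n_lon * n_lat * n_yaw

def decoupled_hypothesis_count(n_lon, n_lat, n_yaw):
    """Hypotheses needed when each DoF is scored independently."""
    return n_lon + n_lat + n_yaw

def soft_argmax(candidates, scores, temperature=1.0):
    """Softmax-weighted expectation over candidate values.

    Unlike a hard argmax, this is differentiable with respect to the
    scores, so it can sit inside an end-to-end trained network.
    (Assumed stand-in for a differentiable matching step.)
    """
    candidates = np.asarray(candidates, dtype=float)
    scores = np.asarray(scores, dtype=float)
    logits = scores / temperature
    logits = logits - logits.max()   # subtract max for numerical stability
    weights = np.exp(logits)
    weights = weights / weights.sum()
    return float(np.dot(weights, candidates))

# Hypothetical setup: 21 candidates per DoF.
n = 21
print(joint_hypothesis_count(n, n, n))      # joint sampling
print(decoupled_hypothesis_count(n, n, n))  # decoupled sampling

# Hypothetical lateral offsets (metres) and matching scores for one DoF.
lateral_candidates = np.linspace(-1.0, 1.0, n)
lateral_scores = -((lateral_candidates - 0.2) ** 2) / 0.05
print(soft_argmax(lateral_candidates, lateral_scores))
```

With 21 candidates per DoF, joint sampling scores 21 × 21 × 21 = 9261 hypotheses, whereas decoupled sampling scores 21 + 21 + 21 = 63. The sketch treats yaw candidates as plain numbers, which is only reasonable for small angular ranges with no wrap-around.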

@article{miao2025_2503.00862,
  title={Efficient End-to-end Visual Localization for Autonomous Driving with Decoupled BEV Neural Matching},
  author={Jinyu Miao and Tuopu Wen and Ziang Luo and Kangan Qian and Zheng Fu and Yunlong Wang and Kun Jiang and Mengmeng Yang and Jin Huang and Zhihua Zhong and Diange Yang},
  journal={arXiv preprint arXiv:2503.00862},
  year={2025}
}