
VXP: Voxel-Cross-Pixel Large-scale Image-LiDAR Place Recognition

21 March 2024
Yun-Jin Li
M. Gladkova
Yan Xia
Rui Wang
Daniel Cremers
Abstract

Cross-modal place recognition methods are flexible GPS alternatives under varying environmental conditions and sensor setups. However, the task is non-trivial because extracting consistent and robust global descriptors from different modalities is challenging. To tackle this issue, we propose Voxel-Cross-Pixel (VXP), a novel camera-to-LiDAR place recognition framework that enforces local similarities in a self-supervised manner and effectively brings global context from images and LiDAR scans into a shared feature space. Specifically, VXP is trained in three stages. First, we deploy a visual transformer to compactly represent input images. Second, we establish local correspondences between image-based and point-cloud-based feature spaces using our novel geometric alignment module. Third, we aggregate local similarities into an expressive shared latent space. Extensive experiments on three benchmarks (Oxford RobotCar, ViViD++, and KITTI) demonstrate that our method surpasses state-of-the-art cross-modal retrieval by a large margin. Our evaluations show that the proposed method is accurate, efficient, and lightweight. Our project page is available at: this https URL
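To make the retrieval setting concrete, the following is a minimal sketch of how place recognition with global descriptors in a shared feature space is commonly evaluated: each image query is matched to its nearest LiDAR database descriptor by cosine similarity, and recall@k counts the queries whose top-k matches include a true positive. It is not the VXP implementation. The function names, the toy 3-dimensional descriptors, and the ground-truth lists are all hypothetical and chosen only for illustration.

```python
import math
from typing import List, Sequence


def l2_normalize(v: Sequence[float]) -> List[float]:
    """Scale a descriptor to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two descriptors."""
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


def retrieve_top_k(query: Sequence[float],
                   database: Sequence[Sequence[float]],
                   k: int = 1) -> List[int]:
    """Return indices of the k database descriptors most similar to the query."""
    scores = [cosine_similarity(query, d) for d in database]
    ranked = sorted(range(len(database)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


def recall_at_k(queries: Sequence[Sequence[float]],
                database: Sequence[Sequence[float]],
                ground_truth: Sequence[Sequence[int]],
                k: int = 1) -> float:
    """Fraction of queries with at least one true positive among the top-k matches."""
    hits = 0
    for q, positives in zip(queries, ground_truth):
        if set(retrieve_top_k(q, database, k)) & set(positives):
            hits += 1
    return hits / len(queries)


# Hypothetical toy data: in practice these would be the outputs of the
# LiDAR and image encoders, not hand-written vectors.
lidar_db = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
image_queries = [[0.9, 0.1, 0.0], [0.1, 0.2, 0.9]]
ground_truth = [[0], [2]]  # database indices that count as the same place

print(retrieve_top_k(image_queries[0], lidar_db, k=1))
print(recall_at_k(image_queries, lidar_db, ground_truth, k=1))
```

The sketch uses a brute-force search for clarity. With a large database, an approximate nearest-neighbor index would typically replace the linear scan.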

@article{li2025_2403.14594,
  title={VXP: Voxel-Cross-Pixel Large-scale Image-LiDAR Place Recognition},
  author={Yun-Jin Li and Mariia Gladkova and Yan Xia and Rui Wang and Daniel Cremers},
  journal={arXiv preprint arXiv:2403.14594},
  year={2025}
}