FastViDAR: Real-Time Omnidirectional Depth Estimation via Alternative Hierarchical Attention

Main: 7 Pages
7 Figures
Bibliography: 1 Page
4 Tables
Abstract

In this paper we propose FastViDAR, a novel framework that takes four fisheye camera inputs and produces a full $360^\circ$ depth map along with per-camera depth, fusion depth, and confidence estimates. Our main contributions are: (1) We introduce the Alternative Hierarchical Attention (AHA) mechanism, which efficiently fuses features across views through separate intra-frame and inter-frame windowed self-attention, achieving cross-view feature mixing with reduced overhead. (2) We propose a novel ERP fusion approach that projects multi-view depth estimates onto a shared equirectangular coordinate system to obtain the final fusion depth. (3) We generate ERP image-depth pairs from the HM3D and 2D3D-S datasets for comprehensive evaluation, demonstrating competitive zero-shot performance on real datasets while achieving up to 20 FPS on NVIDIA Orin NX embedded hardware. Project page: this https URL
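
To make the idea of a shared equirectangular (ERP) coordinate system concrete, the following is a minimal NumPy sketch, not the paper's implementation. It assumes a particular axis convention (y up, longitude growing with the column index) and assumes that per-camera depth maps already resampled onto the ERP grid are combined by a confidence-weighted average; the abstract does not specify the fusion rule, and the names `erp_rays` and `fuse_erp_depths` are hypothetical.

```python
import numpy as np


def erp_rays(height, width):
    """Unit ray direction for every pixel of an ERP image, shape (H, W, 3).

    Assumed convention (not from the paper): longitude spans [-pi, pi) from
    left to right, latitude spans (pi/2, -pi/2) from top to bottom, y is up.
    """
    u = (np.arange(width) + 0.5) / width      # normalized column centers
    v = (np.arange(height) + 0.5) / height    # normalized row centers
    lon = (u - 0.5) * 2.0 * np.pi
    lat = (0.5 - v) * np.pi
    lon, lat = np.meshgrid(lon, lat)          # both of shape (H, W)
    x = np.cos(lat) * np.sin(lon)
    y = np.sin(lat)
    z = np.cos(lat) * np.cos(lon)
    return np.stack([x, y, z], axis=-1)


def fuse_erp_depths(depths, confidences, eps=1e-6):
    """Confidence-weighted average of N per-camera ERP depth maps.

    depths, confidences: arrays of shape (N, H, W). A confidence of 0 marks
    pixels a camera does not observe. This weighting is an illustrative
    assumption, not the fusion rule described in the paper.
    """
    depths = np.asarray(depths, dtype=np.float64)
    confidences = np.asarray(confidences, dtype=np.float64)
    weight_sum = confidences.sum(axis=0)
    fused = (depths * confidences).sum(axis=0) / np.maximum(weight_sum, eps)
    # Pixels that no camera observes are marked as NaN.
    fused = np.where(weight_sum > eps, fused, np.nan)
    return fused, weight_sum
```

Multiplying `erp_rays(H, W)` by a fused depth map of shape `(H, W, 1)` would give a 360-degree point cloud in the assumed rig frame, provided depth is measured as distance along the ray.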
