Rectifying Geometry-Induced Similarity Distortions for Real-World Aerial-Ground Person Re-Identification

29 January 2026

Kailash A. Hambarde

Hugo Proença


Main: 9 Pages

11 Figures

Bibliography: 3 Pages

1 Table

Abstract

Aerial-ground person re-identification (AG-ReID) is fundamentally challenged by extreme viewpoint and distance discrepancies between aerial and ground cameras, which induce severe geometric distortions and invalidate the assumption of a shared similarity space across views. Existing methods primarily rely on geometry-aware feature learning or appearance-conditioned prompting, while implicitly assuming that the geometry-invariant dot-product similarity used in attention mechanisms remains reliable under large viewpoint and scale variations. We argue that this assumption does not hold. Extreme camera geometry systematically distorts the query-key similarity space and degrades attention-based matching, even when feature representations are partially aligned.
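For readers unfamiliar with the similarity the abstract refers to, the following is a minimal sketch of standard scaled dot-product query-key similarity as used in attention. It is not the authors' method; the function names and the toy vectors are assumptions made for illustration. The point it shows is that the score depends only on the feature vectors, so no camera viewpoint or distance information enters the computation.

```python
import math


def dot_product_similarity(queries, keys):
    """Scaled dot-product scores s_ij = (q_i . k_j) / sqrt(d).

    Only feature vectors are used; nothing about camera geometry is an input.
    """
    d = len(queries[0])
    scale = math.sqrt(d)
    return [
        [sum(q * k for q, k in zip(q_vec, k_vec)) / scale for k_vec in keys]
        for q_vec in queries
    ]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(queries, keys):
    """Row-wise softmax of the scaled dot-product scores."""
    return [softmax(row) for row in dot_product_similarity(queries, keys)]


# Toy example (hypothetical features): one query token, two key tokens.
queries = [[1.0, 0.0]]
keys = [[1.0, 0.0], [0.0, 1.0]]
scores = dot_product_similarity(queries, keys)
weights = attention_weights(queries, keys)
```

In this toy case the query attends more strongly to the first key because their features align. If extreme viewpoint or scale changes shift the features of the same person, these scores shift with them, which is the kind of distortion the abstract argues is left uncorrected.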

