Attention, Please! Revisiting Attentive Probing for Masked Image Modeling

Main: 9 pages · Bibliography: 5 pages · Appendix: 14 pages · 19 figures · 5 tables
Abstract

As fine-tuning (FT) becomes increasingly impractical at scale, probing is emerging as the preferred evaluation protocol for self-supervised learning (SSL). Yet standard linear probing (LP) fails to adequately reflect the potential of models trained with Masked Image Modeling (MIM), due to the distributed nature of patch tokens. This motivates the need for attentive probing, an alternative that uses attention to selectively aggregate patch-level features. Despite its growing adoption, attentive probing remains under-explored, with existing methods suffering from excessive parameterization and poor computational efficiency.
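For context, a common form of attentive probing (not necessarily the specific variant studied in this paper) pools the frozen encoder's patch tokens with a single learnable query via cross-attention and feeds the pooled representation to a linear classifier. The sketch below is a minimal PyTorch illustration under that assumption; only the probe's parameters are trained, and the names (`AttentiveProbe`, dimensions) are illustrative.

```python
import torch
import torch.nn as nn


class AttentiveProbe(nn.Module):
    """Cross-attention pooling over frozen patch tokens + linear classifier (illustrative sketch)."""

    def __init__(self, embed_dim: int, num_classes: int, num_heads: int = 8):
        super().__init__()
        # Single learnable query that attends over all patch tokens.
        self.query = nn.Parameter(torch.empty(1, 1, embed_dim))
        nn.init.trunc_normal_(self.query, std=0.02)
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(embed_dim)
        self.head = nn.Linear(embed_dim, num_classes)

    def forward(self, patch_tokens: torch.Tensor) -> torch.Tensor:
        # patch_tokens: (batch, num_patches, embed_dim) from a frozen MIM-pretrained encoder.
        q = self.query.expand(patch_tokens.size(0), -1, -1)
        pooled, _ = self.attn(q, patch_tokens, patch_tokens)  # (batch, 1, embed_dim)
        return self.head(self.norm(pooled.squeeze(1)))        # (batch, num_classes)


# Usage: the encoder stays frozen; only the probe is optimized.
probe = AttentiveProbe(embed_dim=768, num_classes=1000)
tokens = torch.randn(4, 196, 768)  # e.g. 14x14 patch tokens from a ViT-B encoder
logits = probe(tokens)             # (4, 1000)
```

Unlike linear probing over a single pooled vector, the learnable query lets the probe weight informative patches, which is why attentive probing better reflects the distributed representations produced by MIM pre-training.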
