v2 (latest)

AdaDim: Dimensionality Adaptation for SSL Representational Dynamics

Main: 10 pages, 20 figures, 10 tables; Bibliography: 3 pages; Appendix: 14 pages
Abstract

A key factor in effective self-supervised learning (SSL) is preventing dimensional collapse, where a high-dimensional representation space R spans only a lower-dimensional subspace. SSL optimization strategies therefore guide the model toward a higher-dimensionality R (higher H(R)) through objectives that encourage feature decorrelation or sample uniformity in R. A higher H(R) indicates that R has greater feature diversity, which aids generalization to downstream tasks. Alongside dimensionality optimization, SSL algorithms also employ a projection head that maps R into an embedding space Z. Recent work has characterized the projection head as a filter that removes noisy or irrelevant features from the SSL objective by reducing the mutual information I(R;Z). The prevailing view in the literature is thus that a good SSL representation space should have a high H(R) and a low I(R;Z). However, this view lacks an account of the underlying training dynamics that shape the relationship between the two terms. Our analysis shows that the best-performing SSL models have neither the highest H(R) nor the lowest I(R;Z), but instead strike an effective balance between the two. Building on this analysis, we introduce AdaDim, a training strategy that leverages SSL training dynamics by adaptively balancing between increasing H(R) (through feature decorrelation and sample uniformity) and gradually regularizing I(R;Z) as training progresses. We show performance improvements of up to 3% over common SSL baselines, even though our method does not rely on expensive techniques such as queues, clustering, predictor networks, or student-teacher architectures.
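The abstract does not spell out the loss, so the following is only a minimal PyTorch sketch of an adaptive schedule in the spirit described above: an H(R)-raising term (feature decorrelation plus sample uniformity) whose weight is gradually traded off against an I(R;Z) regularizer as training progresses. The specific loss functions, the cross-correlation proxy for I(R;Z), and the linear schedule are all assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn.functional as F


def decorrelation_loss(r: torch.Tensor) -> torch.Tensor:
    # Push off-diagonal entries of the feature covariance toward zero,
    # encouraging a higher-dimensional representation space (higher H(R)).
    r = (r - r.mean(dim=0)) / (r.std(dim=0) + 1e-6)
    cov = (r.T @ r) / (r.shape[0] - 1)
    off_diag = cov - torch.diag(torch.diag(cov))
    return off_diag.pow(2).sum() / r.shape[1]


def uniformity_loss(r: torch.Tensor, t: float = 2.0) -> torch.Tensor:
    # Spread samples over the unit hypersphere (Wang & Isola-style uniformity).
    r = F.normalize(r, dim=-1)
    return torch.log(torch.exp(-t * torch.pdist(r).pow(2)).mean())


def mi_proxy(r: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
    # Crude stand-in for reducing I(R;Z): penalize cross-correlation between
    # batch-normalized features of R and Z. A placeholder, not the paper's estimator.
    r = F.normalize(r - r.mean(dim=0), dim=0)
    z = F.normalize(z - z.mean(dim=0), dim=0)
    return (r.T @ z).pow(2).mean()


def adadim_style_loss(r: torch.Tensor, z: torch.Tensor,
                      step: int, total_steps: int) -> torch.Tensor:
    # Hypothetical linear schedule: early training emphasizes raising H(R);
    # the I(R;Z) regularizer is phased in as training progresses.
    alpha = step / max(total_steps, 1)
    h_term = decorrelation_loss(r) + uniformity_loss(r)
    return (1.0 - alpha) * h_term + alpha * mi_proxy(r, z)
```

Here r is a batch of backbone representations and z the corresponding projection-head embeddings; in practice this term would be added to the base SSL objective rather than replace it.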
