Task-Driven Kernel Flows: Label Rank Compression and Laplacian Spectral Filtering

Hongxi Li
Chunlin Huang
Main: 31 pages; Bibliography: 2 pages; Appendix: 14 pages; 3 figures; 2 tables

Abstract

We present a theory of feature learning in wide L2-regularized networks showing that supervised learning is inherently compressive. We derive a kernel ODE that predicts a "water-filling" spectral evolution and prove that for any stable steady state, the kernel rank is bounded by the number of classes ($C$). We further demonstrate that SGD noise is similarly low-rank ($O(C)$), confining dynamics to the task-relevant subspace. This framework unifies the deterministic and stochastic views of alignment and contrasts the low-rank nature of supervised learning with the high-rank, expansive representations of self-supervision.
