Task-Driven Kernel Flows: Label Rank Compression and Laplacian Spectral Filtering
Hongxi Li
Chunlin Huang
Main: 31 Pages
3 Figures
Bibliography: 2 Pages
2 Tables
Appendix: 14 Pages
Abstract
We present a theory of feature learning in wide L2-regularized networks showing that supervised learning is inherently compressive. We derive a kernel ODE that predicts a "water-filling" spectral evolution and prove that for any stable steady state, the kernel rank is bounded by the number of classes. We further demonstrate that SGD noise is similarly low-rank, confining dynamics to the task-relevant subspace. This framework unifies the deterministic and stochastic views of alignment and contrasts the low-rank nature of supervised learning with the high-rank, expansive representations of self-supervision.
