Eigenvalue distribution of the Neural Tangent Kernel in the quadratic scaling

27 August 2025

Main: 40 pages

8 figures

Bibliography: 2 pages

Abstract

We compute the asymptotic eigenvalue distribution of the neural tangent kernel of a two-layer neural network under a specific scaling of dimension. Namely, if $X\in\mathbb{R}^{n\times d}$ is an i.i.d. random matrix, $W\in\mathbb{R}^{d\times p}$ is an i.i.d. $\mathcal{N}(0,1)$ matrix and $D\in\mathbb{R}^{p\times p}$ is a diagonal matrix with i.i.d. bounded entries, we consider the matrix \[\mathrm{NTK}=\frac{1}{d}XX^\top\odot\frac{1}{p}\sigma'\left(\frac{1}{\sqrt{d}}XW\right)D^2\sigma'\left(\frac{1}{\sqrt{d}}XW\right)^\top,\] where $\sigma'$ is a pseudo-Lipschitz function applied entrywise, under the scaling $\frac{n}{dp}\to \gamma_1$ and $\frac{p}{d}\to \gamma_2$. We describe the asymptotic distribution as the free multiplicative convolution of the Marchenko--Pastur distribution with a deterministic distribution depending on $\sigma$ and $D$.
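
As a rough numerical illustration of the matrix defined above, the following sketch samples one realization of $\mathrm{NTK}$ and computes its eigenvalues. It is not taken from the paper. The function name `sample_ntk`, the choice $\sigma=\tanh$ (so $\sigma'=1-\tanh^2$), the standard Gaussian entries of $X$, the uniform law on $[-1,1]$ for the diagonal of $D$, and the sizes $d$, $p$, $\gamma_1$ are all assumptions made only for the example.

```python
import numpy as np

def sample_ntk(n, d, p, seed=0):
    """Sample one realization of the NTK matrix from the abstract.

    Assumptions (not from the paper): X has i.i.d. N(0,1) entries,
    sigma = tanh, and diag(D) is i.i.d. uniform on [-1, 1].
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, d))           # data matrix, n x d
    W = rng.standard_normal((d, p))           # first-layer weights, d x p
    diag_D = rng.uniform(-1.0, 1.0, size=p)   # bounded diagonal entries of D

    pre = X @ W / np.sqrt(d)                  # pre-activations, n x p
    S = 1.0 - np.tanh(pre) ** 2               # sigma'(pre), applied entrywise

    gram = X @ X.T / d                        # (1/d) X X^T
    feat = (S * diag_D**2) @ S.T / p          # (1/p) sigma' D^2 sigma'^T
    return gram * feat                        # Hadamard (entrywise) product

# Sizes chosen so that n / (d p) = gamma_1, the scaling in the abstract.
d, p, gamma1 = 20, 30, 0.25
n = int(gamma1 * d * p)
K = sample_ntk(n, d, p)
eigs = np.linalg.eigvalsh(K)                  # real eigenvalues, ascending
```

Both factors of the Hadamard product are positive semidefinite, so by the Schur product theorem the sampled matrix is as well, and `eigs` should be nonnegative up to floating-point error. A histogram of `eigs` at larger sizes could then be compared against the limiting distribution described in the abstract.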
