
Overparametrized linear dimensionality reductions: From projection pursuit to two-layer neural networks

Abstract

Given a cloud of $n$ data points in $\mathbb{R}^d$, consider all projections onto $m$-dimensional subspaces of $\mathbb{R}^d$ and, for each such projection, the empirical distribution of the projected points. What does this collection of probability distributions look like when $n,d$ grow large?

We consider this question under the null model in which the points are i.i.d. standard Gaussian vectors, focusing on the asymptotic regime in which $n,d\to\infty$, with $n/d\to\alpha\in (0,\infty)$, while $m$ is fixed. Denoting by $\mathscr{F}_{m,\alpha}$ the set of probability distributions in $\mathbb{R}^m$ that arise as low-dimensional projections in this limit, we establish new inner and outer bounds on $\mathscr{F}_{m,\alpha}$. In particular, we characterize the Wasserstein radius of $\mathscr{F}_{m,\alpha}$ up to constant multiplicative factors, and determine it exactly for $m=1$. We also prove sharp bounds in terms of Kullback-Leibler divergence and Rényi information dimension.

The previous question has applications to unsupervised learning methods, such as projection pursuit and independent component analysis. We introduce a version of the same problem that is relevant for supervised learning, and prove a sharp Wasserstein radius bound. As an application, we establish an upper bound on the interpolation threshold of two-layer neural networks with $m$ hidden neurons.
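The setup in the abstract can be illustrated numerically. The following is a minimal sketch, not taken from the paper, assuming NumPy is available: it draws $n$ i.i.d. standard Gaussian points in $\mathbb{R}^d$ and projects them onto one $m$-dimensional subspace. The function name `project_onto_random_subspace` and the values of `n`, `d`, `m` (giving $n/d = 2$) are hypothetical choices made for illustration only.

```python
import numpy as np

def project_onto_random_subspace(n, d, m, seed=0):
    """Draw n standard Gaussian points in R^d and project them onto a
    random m-dimensional subspace (hypothetical helper, illustration only)."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, d))      # data cloud under the null model
    G = rng.standard_normal((d, m))
    W, _ = np.linalg.qr(G)               # d x m matrix with orthonormal columns
    Z = X @ W                            # n projected points in R^m
    return X, W, Z

# Arbitrary illustrative sizes: alpha = n / d = 2, with m = 2.
X, W, Z = project_onto_random_subspace(n=2000, d=1000, m=2)

# The empirical distribution of the projection is the point cloud Z itself;
# its first two moments are easy to inspect.
print("empirical mean:", Z.mean(axis=0))
print("empirical covariance:\n", np.cov(Z, rowvar=False))
```

Because the subspace here is chosen independently of the data, the projected points are themselves i.i.d. standard Gaussian in $\mathbb{R}^m$. The question posed in the abstract concerns all subspaces at once, including those selected after looking at the data, which this sketch does not attempt to explore.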

@article{montanari2025_2206.06526,
  title={Overparametrized linear dimensionality reductions: From projection pursuit to two-layer neural networks},
  author={Andrea Montanari and Kangjie Zhou},
  journal={arXiv preprint arXiv:2206.06526},
  year={2025}
}