Overparametrized linear dimensionality reductions: From projection pursuit to two-layer neural networks

14 June 2022

Abstract

Given a cloud of $n$ data points in $\mathbb{R}^d$, consider all projections onto $m$-dimensional subspaces of $\mathbb{R}^d$ and, for each such projection, the empirical distribution of the projected points. What does this collection of probability distributions look like when $n,d$ grow large? We consider this question under the null model in which the points are i.i.d. standard Gaussian vectors, focusing on the asymptotic regime in which $n,d\to\infty$, with $n/d\to\alpha\in (0,\infty)$, while $m$ is fixed. Denoting by $\mathscr{F}_{m, \alpha}$ the set of probability distributions in $\mathbb{R}^m$ that arise as low-dimensional projections in this limit, we establish new inner and outer bounds on $\mathscr{F}_{m, \alpha}$. In particular, we characterize the Wasserstein radius of $\mathscr{F}_{m,\alpha}$ up to constant multiplicative factors, and determine it exactly for $m=1$. We also prove sharp bounds in terms of Kullback-Leibler divergence and Rényi information dimension.

The previous question has applications to unsupervised learning methods, such as projection pursuit and independent component analysis. We introduce a version of the same problem that is relevant for supervised learning, and prove a sharp Wasserstein radius bound. As an application, we establish an upper bound on the interpolation threshold of two-layer neural networks with $m$ hidden neurons.
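The setting can be illustrated numerically. The following is a minimal sketch, not the authors' method: it draws $n$ i.i.d. standard Gaussian points in $\mathbb{R}^d$ and compares the projection onto a random $m$-dimensional subspace with a projection chosen after looking at the data (here, the top right singular vectors, which is only one convenient data-dependent choice). The function names and the values of `n`, `d`, `m` are assumptions made for the example. For $m=1$, the random direction should give an empirical second moment near 1, while the data-dependent direction gives a visibly larger one (standard random matrix theory suggests roughly $(1+\sqrt{d/n})^2$), so even pure-noise data admits projections whose empirical distribution is far from standard Gaussian.

```python
import numpy as np


def random_subspace_projection(X, m, rng):
    """Project the rows of X onto a random m-dimensional subspace.

    The subspace is independent of the data: its orthonormal basis comes
    from the QR factorization of a d x m Gaussian matrix.
    """
    d = X.shape[1]
    Q, _ = np.linalg.qr(rng.standard_normal((d, m)))  # Q is d x m, orthonormal columns
    return X @ Q


def top_singular_projection(X, m):
    """Project the rows of X onto the span of its top m right singular vectors.

    This subspace is chosen using the data (an illustrative choice only).
    """
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt[:m].T


rng = np.random.default_rng(0)
n, d, m = 400, 200, 1            # alpha = n / d = 2 (illustrative values)
X = rng.standard_normal((n, d))  # null model: i.i.d. standard Gaussian points

Z_random = random_subspace_projection(X, m, rng)
Z_top = top_singular_projection(X, m)

# Empirical second moments of the two one-dimensional projected samples.
var_random = float(np.mean(Z_random ** 2))
var_top = float(np.mean(Z_top ** 2))
print("random direction:        ", var_random)
print("data-dependent direction:", var_top)
```

The second moment is only a crude summary of the projected empirical distribution; it is used here because it is easy to compute and already separates the two cases.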


@article{montanari2025_2206.06526,
  title={Overparametrized linear dimensionality reductions: From projection pursuit to two-layer neural networks},
  author={Andrea Montanari and Kangjie Zhou},
  journal={arXiv preprint arXiv:2206.06526},
  year={2025}
}
