
Non-asymptotic approximations of Gaussian neural networks via second-order Poincaré inequalities

Symposium on Advances in Approximate Bayesian Inference (AABI), 2023
Main: 11 pages · 3 figures · Bibliography: 5 pages · Appendix: 18 pages
Abstract

There is a recent and growing literature on large-width asymptotic and non-asymptotic properties of deep Gaussian neural networks (NNs), namely NNs whose weights are initialized as Gaussian distributions. For a Gaussian NN of depth $L\geq 1$ and width $n\geq 1$, it is well known that, as $n\rightarrow+\infty$, the NN's output converges (in distribution) to a Gaussian process. Recently, quantitative versions of this result, known as quantitative central limit theorems (QCLTs), have been obtained, showing that the rate of convergence is $n^{-1}$ in the $2$-Wasserstein distance, and that such a rate is optimal. In this paper, we investigate the use of second-order Poincaré inequalities as an alternative approach to establish QCLTs for the NN's output. Previous approaches rely on a careful analysis of the NN, combining non-trivial probabilistic tools with ad-hoc techniques that exploit the recursive definition of the network, typically by means of an induction argument over the layers, and it is unclear if and how they extend to other NN architectures. Instead, the use of second-order Poincaré inequalities relies only on the fact that the NN is a functional of a Gaussian process, reducing the problem of establishing QCLTs to the algebraic problem of computing the gradient and Hessian of the NN's output, which does carry over to other NN architectures. We show that our approach is effective in establishing QCLTs for the NN's output, though it leads to suboptimal rates of convergence. We argue that such a worsening of the rates is peculiar to second-order Poincaré inequalities, and should be interpreted as the "cost" of having a straightforward and general procedure for obtaining QCLTs.
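The large-width Gaussian limit stated in the abstract can be illustrated empirically. The sketch below is a minimal, hedged example (not the paper's construction): it samples the scalar output of a one-hidden-layer NN with i.i.d. standard Gaussian weights, tanh activation, and $n^{-1/2}$ scaling, all of which are illustrative assumptions, and checks that the empirical output distribution looks Gaussian for large width $n$.

```python
# Hedged sketch: empirical illustration of the large-width Gaussian behavior
# of a Gaussian NN's output. The specific architecture (one hidden layer,
# tanh activation, 1/sqrt(n) scaling, input x = 1.0) is an assumption made
# for illustration only, not the paper's exact setting.
import numpy as np

def nn_output_samples(n, num_samples, rng):
    """Draw num_samples i.i.d. realizations of
    f(x) = n^{-1/2} * sum_i v_i * tanh(w_i * x) at x = 1.0,
    with all weights v_i, w_i i.i.d. standard Gaussian."""
    w = rng.standard_normal((num_samples, n))  # input-to-hidden weights
    v = rng.standard_normal((num_samples, n))  # hidden-to-output weights
    return (v * np.tanh(w * 1.0)).sum(axis=1) / np.sqrt(n)

rng = np.random.default_rng(0)
samples = nn_output_samples(n=2000, num_samples=20000, rng=rng)

# For a Gaussian limit: mean ~ 0, odd moments ~ 0, normalized fourth moment ~ 3.
m2 = (samples ** 2).mean()
print("mean:", samples.mean())
print("excess kurtosis:", (samples ** 4).mean() / m2 ** 2 - 3.0)
```

At width $n = 2000$ the sample mean and excess kurtosis are both close to zero, consistent with the distributional convergence to a Gaussian as $n\rightarrow+\infty$; the paper's contribution concerns the *rate* of this convergence, which a simulation of this kind does not resolve.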
