
Gaussian Universality in Neural Network Dynamics with Generalized Structured Input Distributions

Journal of Statistical Mechanics: Theory and Experiment (JSTAT), 2024
Main: 31 pages, 16 figures, 1 table
Abstract

Analyzing the dynamics of neural networks trained with stochastic gradient descent (SGD) is crucial to building theoretical foundations for deep learning. Previous work has analyzed structured inputs within the hidden manifold model, often under the simplifying assumption of a Gaussian distribution. We extend this framework by modeling inputs as Gaussian mixtures, which better represent complex, real-world data. Through empirical and theoretical investigation, we demonstrate that with proper standardization, the learning dynamics converge to the behavior seen in the simple Gaussian case. This finding exhibits a form of universality: diverse structured input distributions yield results consistent with Gaussian assumptions, thereby strengthening the theoretical understanding of deep learning models.
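The setup described in the abstract can be made concrete with a short sketch. The following is a minimal illustration, not the authors' code: it generates hidden-manifold-model inputs whose latent vectors are drawn from a Gaussian mixture and then standardized. The dimensions, the two-component mixture, and the tanh nonlinearity are all illustrative assumptions.

# Minimal sketch of hidden-manifold-model inputs with Gaussian-mixture
# latents. All specific choices below (D, N, n, the 2-component mixture,
# tanh) are assumptions for illustration, not taken from the paper.
import numpy as np

rng = np.random.default_rng(0)

D, N, n = 50, 500, 10_000        # latent dim, input dim, number of samples
F = rng.standard_normal((D, N))  # fixed feature map defining the manifold

# Latent variables from a Gaussian mixture instead of a single Gaussian.
means = np.array([[-2.0] * D, [2.0] * D])
labels = rng.integers(0, 2, size=n)
c = means[labels] + rng.standard_normal((n, D))

# Push latents through the manifold's nonlinearity to obtain inputs.
x = np.tanh(c @ F / np.sqrt(D))

# Standardize inputs feature-wise; per the abstract, with this step the
# SGD learning dynamics match those of the plain-Gaussian latent case.
x = (x - x.mean(axis=0)) / x.std(axis=0)

Under the paper's claim, SGD training on inputs generated this way should trace the same learning curves as in the purely Gaussian latent case.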
