Convergence of continuous-time stochastic gradient descent with applications to deep neural networks

11 September 2024

Gabor Lugosi

Eulalia Nualart

Main: 21 pages

Bibliography: 2 pages

Abstract

We study a continuous-time approximation of the stochastic gradient descent process for minimizing the population expected loss in learning problems. The main results establish general sufficient conditions for convergence, extending the results that Chatterjee (2022) established for (nonstochastic) gradient descent. We show how the main result can be applied to the training of overparametrized neural networks.
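
To make the idea of a continuous-time approximation of stochastic gradient descent concrete, the following is a minimal sketch and not the paper's construction. It assumes a generic diffusion of the form dθ = -∇f(θ) dt + sqrt(η) σ(θ) dW, simulated with the Euler-Maruyama scheme. The toy loss f(θ) = θ²/2, the step size `eta`, the noise function `sigma`, and the names `grad` and `simulate` are all hypothetical choices made for illustration; the abstract does not specify any of them.

```python
import math
import random


def grad(theta):
    # Gradient of the toy loss f(theta) = 0.5 * theta**2 (an assumption).
    return theta


def simulate(theta0, eta, sigma, dt, steps, seed=0):
    """Euler-Maruyama discretization of
    d(theta) = -grad(theta) dt + sqrt(eta) * sigma(theta) dW.

    sigma is a callable giving the noise scale at the current point.
    """
    rng = random.Random(seed)
    theta = theta0
    for _ in range(steps):
        # Brownian increment over a time step of length dt.
        dw = rng.gauss(0.0, math.sqrt(dt))
        theta += -grad(theta) * dt + math.sqrt(eta) * sigma(theta) * dw
    return theta


# Noiseless case: reduces to an Euler discretization of gradient flow.
flow = simulate(1.0, eta=0.1, sigma=lambda t: 0.0, dt=0.01, steps=1000)

# Noise that vanishes at the minimizer theta = 0 (a modeling assumption).
noisy = simulate(1.0, eta=0.1, sigma=lambda t: t, dt=0.01, steps=1000)

print(flow, noisy)
```

In this sketch the noise scale is chosen to vanish at the minimizer, so the simulated path can settle there instead of fluctuating around it. Whether this matches the conditions studied in the paper is not something the abstract states.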
