arXiv: 2102.05375
Strength of Minibatch Noise in SGD
10 February 2021
Liu Ziyin, Kangqiao Liu, Takashi Mori, Masahito Ueda
Tags: ODL, MLT

Papers citing "Strength of Minibatch Noise in SGD" (10 of 10 papers shown)

SGD as Free Energy Minimization: A Thermodynamic View on Neural Network Training
Ildus Sadrtdinov, Ivan Klimov, E. Lobacheva, Dmitry Vetrov
29 May 2025

Decentralized SGD and Average-direction SAM are Asymptotically Equivalent
Tongtian Zhu, Fengxiang He, Kaixuan Chen, Mingli Song, Dacheng Tao
05 Jun 2023

Evolutionary Algorithms in the Light of SGD: Limit Equivalence, Minima Flatness, and Transfer Learning
Andrei Kucharavy, R. Guerraoui, Ljiljana Dolamic
20 May 2023

Training trajectories, mini-batch losses and the curious role of the learning rate
Mark Sandler, A. Zhmoginov, Max Vladymyrov, Nolan Miller
Tags: ODL
05 Jan 2023

On the Overlooked Structure of Stochastic Gradients
Zeke Xie, Qian-Yuan Tang, Mingming Sun, P. Li
05 Dec 2022

Two Facets of SDE Under an Information-Theoretic Lens: Generalization of SGD via Training Trajectories and via Terminal States
Ziqiao Wang, Yongyi Mao
19 Nov 2022

Noise Injection Node Regularization for Robust Learning
N. Levi, I. Bloch, M. Freytsis, T. Volansky
Tags: AI4CE
27 Oct 2022

SGD with Large Step Sizes Learns Sparse Features
Maksym Andriushchenko, Aditya Varre, Loucas Pillaud-Vivien, Nicolas Flammarion
11 Oct 2022

Stochastic Neural Networks with Infinite Width are Deterministic
Liu Ziyin, Hanlin Zhang, Xiangming Meng, Yuting Lu, Eric P. Xing, Masahito Ueda
30 Jan 2022

SGD with a Constant Large Learning Rate Can Converge to Local Maxima
Liu Ziyin, Botao Li, James B. Simon, Masahito Ueda
25 Jul 2021