On the Heavy-Tailed Theory of Stochastic Gradient Descent for Deep Neural Networks

29 November 2019

Mert Gurbuzbalaban

Papers citing "On the Heavy-Tailed Theory of Stochastic Gradient Descent for Deep Neural Networks"

Title
Nonlinear Stochastic Gradient Descent and Heavy-tailed Noise: A Unified Framework and High-probability Guarantees Aleksandar Armacki Shuhua Yu Pranay Sharma Gauri Joshi Dragana Bajović D. Jakovetić S. Kar 57 2 0 17 Oct 2024
Differential Private Stochastic Optimization with Heavy-tailed Data: Towards Optimal Rates Puning Zhao Jiafei Wu Zhe Liu Chong Wang Rongfei Fan Qingming Li 45 1 0 19 Aug 2024
Stochastic Nonsmooth Convex Optimization with Heavy-Tailed Noises: High-Probability Bound, In-Expectation Rate and Initial Distance Adaptation Zijian Liu Zhengyuan Zhou 24 10 0 22 Mar 2023
Breaking the Lower Bound with (Little) Structure: Acceleration in Non-Convex Stochastic Optimization with Heavy-Tailed Noise Zijian Liu Jiawei Zhang Zhengyuan Zhou 32 12 0 14 Feb 2023
Cyclic and Randomized Stepsizes Invoke Heavier Tails in SGD than Constant Stepsize Mert Gurbuzbalaban Yuanhan Hu Umut Simsekli Lingjiong Zhu LRM 20 1 0 10 Feb 2023
Heavy-Tail Phenomenon in Decentralized SGD Mert Gurbuzbalaban Yuanhan Hu Umut Simsekli Kun Yuan Lingjiong Zhu 32 8 0 13 May 2022
Nonlinear gradient mappings and stochastic optimization: A general framework with applications to heavy-tail noise D. Jakovetić Dragana Bajović Anit Kumar Sahu S. Kar Nemanja Milošević Dusan Stamenkovic 17 12 0 06 Apr 2022
Intrinsic Dimension, Persistent Homology and Generalization in Neural Networks Tolga Birdal Aaron Lou Leonidas J. Guibas Umut Şimşekli 27 61 0 25 Nov 2021
Exponential escape efficiency of SGD from sharp minima in non-stationary regime Hikaru Ibayashi Masaaki Imaizumi 26 4 0 07 Nov 2021
Fractal Structure and Generalization Properties of Stochastic Optimization Algorithms A. Camuto George Deligiannidis Murat A. Erdogdu Mert Gurbuzbalaban Umut Şimşekli Lingjiong Zhu 27 29 0 09 Jun 2021
Convergence Rates of Stochastic Gradient Descent under Infinite Noise Variance Hongjian Wang Mert Gurbuzbalaban Lingjiong Zhu Umut Şimşekli Murat A. Erdogdu 15 41 0 20 Feb 2021
Hausdorff Dimension, Heavy Tails, and Generalization in Neural Networks Umut Simsekli Ozan Sener George Deligiannidis Murat A. Erdogdu 44 55 0 16 Jun 2020
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima N. Keskar Dheevatsa Mudigere J. Nocedal M. Smelyanskiy P. T. P. Tang ODL 281 2,889 0 15 Sep 2016