ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

Escaping Saddles with Stochastic Gradients (arXiv:1803.05999)
15 March 2018
Hadi Daneshmand, Jonas Köhler, Aurélien Lucchi, Thomas Hofmann

Papers citing "Escaping Saddles with Stochastic Gradients"

25 / 25 papers shown
Loss Landscape of Shallow ReLU-like Neural Networks: Stationary Points, Saddle Escape, and Network Embedding
Zhengqing Wu, Berfin Simsek, Francois Ged · ODL · 38 · 0 · 0 · 08 Feb 2024

Score-Aware Policy-Gradient Methods and Performance Guarantees using Local Lyapunov Conditions: Applications to Product-Form Stochastic Networks and Queueing Systems
Céline Comte, Matthieu Jonckheere, J. Sanders, Albert Senen-Cerda · 25 · 0 · 0 · 05 Dec 2023

How to escape sharp minima with random perturbations
Kwangjun Ahn, Ali Jadbabaie, S. Sra · ODL · 24 · 6 · 0 · 25 May 2023

Stochastic Dimension-reduced Second-order Methods for Policy Optimization
Jinsong Liu, Chen Xie, Qinwen Deng, Dongdong Ge, Yi-Li Ye · 19 · 1 · 0 · 28 Jan 2023

An SDE for Modeling SAM: Theory and Insights
Enea Monzio Compagnoni, Luca Biggio, Antonio Orvieto, F. Proske, Hans Kersting, Aurélien Lucchi · 21 · 13 · 0 · 19 Jan 2023

Escaping Saddle Points for Effective Generalization on Class-Imbalanced Data
Harsh Rangwani, Sumukh K Aithal, Mayank Mishra, R. Venkatesh Babu · 26 · 27 · 0 · 28 Dec 2022

On the Overlooked Structure of Stochastic Gradients
Zeke Xie, Qian-Yuan Tang, Mingming Sun, P. Li · 23 · 6 · 0 · 05 Dec 2022

Passage-Mask: A Learnable Regularization Strategy for Retriever-Reader Models
Shujian Zhang, Chengyue Gong, Xingchao Liu · RALM · 37 · 6 · 0 · 02 Nov 2022

Behind the Scenes of Gradient Descent: A Trajectory Analysis via Basis Function Decomposition
Jianhao Ma, Li-Zhen Guo, S. Fattahi · 34 · 4 · 0 · 01 Oct 2022

Tackling benign nonconvexity with smoothing and stochastic gradients
Harsh Vardhan, Sebastian U. Stich · 16 · 8 · 0 · 18 Feb 2022

Non-Asymptotic Analysis of Online Multiplicative Stochastic Gradient Descent
Riddhiman Bhattacharya, Tiefeng Jiang · 8 · 0 · 0 · 14 Dec 2021

Exponential escape efficiency of SGD from sharp minima in non-stationary regime
Hikaru Ibayashi, Masaaki Imaizumi · 26 · 4 · 0 · 07 Nov 2021

Faster Perturbed Stochastic Gradient Methods for Finding Local Minima
Zixiang Chen, Dongruo Zhou, Quanquan Gu · 25 · 1 · 0 · 25 Oct 2021

The loss landscape of deep linear neural networks: a second-order analysis
E. M. Achour, François Malgouyres, Sébastien Gerchinovitz · ODL · 22 · 9 · 0 · 28 Jul 2021

Positive-Negative Momentum: Manipulating Stochastic Gradient Noise to Improve Generalization
Zeke Xie, Li-xin Yuan, Zhanxing Zhu, Masashi Sugiyama · 13 · 29 · 0 · 31 Mar 2021

Provable Super-Convergence with a Large Cyclical Learning Rate
Samet Oymak · 28 · 12 · 0 · 22 Feb 2021

Learning explanations that are hard to vary
Giambattista Parascandolo, Alexander Neitz, Antonio Orvieto, Luigi Gresele, Bernhard Schölkopf · FAtt · 13 · 178 · 0 · 01 Sep 2020

On the Almost Sure Convergence of Stochastic Gradient Descent in Non-Convex Problems
P. Mertikopoulos, Nadav Hallak, Ali Kavis, V. Cevher · 6 · 85 · 0 · 19 Jun 2020

Shadowing Properties of Optimization Algorithms
Antonio Orvieto, Aurélien Lucchi · 17 · 18 · 0 · 12 Nov 2019

Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies
K. Zhang, Alec Koppel, Haoqi Zhu, Tamer Basar · 25 · 186 · 0 · 19 Jun 2019

On the Noisy Gradient Descent that Generalizes as SGD
Jingfeng Wu, Wenqing Hu, Haoyi Xiong, Jun Huan, Vladimir Braverman, Zhanxing Zhu · MLT · 16 · 10 · 0 · 18 Jun 2019

A Tail-Index Analysis of Stochastic Gradient Noise in Deep Neural Networks
Umut Simsekli, Levent Sagun, Mert Gurbuzbalaban · 13 · 237 · 0 · 18 Jan 2019

SGD Converges to Global Minimum in Deep Learning via Star-convex Path
Yi Zhou, Junjie Yang, Huishuai Zhang, Yingbin Liang, Vahid Tarokh · 14 · 71 · 0 · 02 Jan 2019

Stochastic Nested Variance Reduction for Nonconvex Optimization
Dongruo Zhou, Pan Xu, Quanquan Gu · 22 · 146 · 0 · 20 Jun 2018

The Loss Surfaces of Multilayer Networks
A. Choromańska, Mikael Henaff, Michaël Mathieu, Gerard Ben Arous, Yann LeCun · ODL · 177 · 1,185 · 0 · 30 Nov 2014