How to escape sharp minima with random perturbations

How to escape sharp minima with random perturbations

25 May 2023

Papers citing "How to escape sharp minima with random perturbations"

11 / 11 papers shown

Title
Gradient Extrapolation for Debiased Representation Learning Ihab Asaad M. Shadaydeh Joachim Denzler 33 0 0 17 Mar 2025
Sharpness-Aware Minimization Efficiently Selects Flatter Minima Late in Training Zhanpeng Zhou Mingze Wang Yuchen Mao Bingrui Li Junchi Yan AAML 57 0 0 14 Oct 2024
On the Trade-off between Flatness and Optimization in Distributed Learning Ying Cao Zhaoxian Wu Kun Yuan Ali H. Sayed 25 1 0 28 Jun 2024
Does SGD really happen in tiny subspaces? Minhak Song Kwangjun Ahn Chulhee Yun 47 4 1 25 May 2024
The Dynamics of Sharpness-Aware Minimization: Bouncing Across Ravines and Drifting Towards Wide Minima Peter L. Bartlett Philip M. Long Olivier Bousquet 63 34 0 04 Oct 2022
Understanding Gradient Descent on Edge of Stability in Deep Learning Sanjeev Arora Zhiyuan Li A. Panigrahi MLT 72 88 0 19 May 2022
Sharpness-Aware Minimization Improves Language Model Generalization Dara Bahri H. Mobahi Yi Tay 119 82 0 16 Oct 2021
What Happens after SGD Reaches Zero Loss? --A Mathematical Framework Zhiyuan Li Tianhao Wang Sanjeev Arora MLT 83 98 0 13 Oct 2021
Large Learning Rate Tames Homogeneity: Convergence and Balancing Effect Yuqing Wang Minshuo Chen T. Zhao Molei Tao AI4CE 52 40 0 07 Oct 2021
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima N. Keskar Dheevatsa Mudigere J. Nocedal M. Smelyanskiy P. T. P. Tang ODL 273 2,878 0 15 Sep 2016
Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak-Łojasiewicz Condition Hamed Karimi J. Nutini Mark W. Schmidt 119 1,190 0 16 Aug 2016