ResearchTrend.AI
Practical Sharpness-Aware Minimization Cannot Converge All the Way to Optima

Dongkuk Si, Chulhee Yun
16 June 2023 · arXiv:2306.09850

Papers citing "Practical Sharpness-Aware Minimization Cannot Converge All the Way to Optima"

18 / 18 papers shown

| Title | Authors | Tags | Metrics | Date |
|---|---|---|---|---|
| Sharpness-Aware Minimization: General Analysis and Improved Rates | Dimitris Oikonomou, Nicolas Loizou | | 55 / 0 / 0 | 04 Mar 2025 |
| LORENZA: Enhancing Generalization in Low-Rank Gradient LLM Training via Efficient Zeroth-Order Adaptive SAM | Yehonathan Refael, Iftach Arbel, Ofir Lindenbaum, Tom Tirer | | 59 / 0 / 0 | 26 Feb 2025 |
| Implicit Regularization of Sharpness-Aware Minimization for Scale-Invariant Problems | Bingcong Li, Liang Zhang, Niao He | | 36 / 2 / 0 | 18 Oct 2024 |
| Combinatorial Multi-armed Bandits: Arm Selection via Group Testing | Arpan Mukherjee, Shashanka Ubaru, K. Murugesan, Karthikeyan Shanmugam, A. Tajer | | 27 / 0 / 0 | 14 Oct 2024 |
| Sharpness-Aware Minimization Efficiently Selects Flatter Minima Late in Training | Zhanpeng Zhou, Mingze Wang, Yuchen Mao, Bingrui Li, Junchi Yan | AAML | 42 / 0 / 0 | 14 Oct 2024 |
| Convergence of Sharpness-Aware Minimization Algorithms using Increasing Batch Size and Decaying Learning Rate | Hinata Harada, Hideaki Iiduka | | 18 / 1 / 0 | 16 Sep 2024 |
| Learning Scalable Model Soup on a Single GPU: An Efficient Subspace Training Strategy | Tao Li, Weisen Jiang, Fanghui Liu, X. Huang, James T. Kwok | MoMe | 38 / 0 / 0 | 04 Jul 2024 |
| Does SGD really happen in tiny subspaces? | Minhak Song, Kwangjun Ahn, Chulhee Yun | | 36 / 4 / 1 | 25 May 2024 |
| SADDLe: Sharpness-Aware Decentralized Deep Learning with Heterogeneous Data | Sakshi Choudhary, Sai Aparna Aketi, Kaushik Roy | FedML | 32 / 0 / 0 | 22 May 2024 |
| Critical Influence of Overparameterization on Sharpness-aware Minimization | Sungbin Shin, Dongyeop Lee, Maksym Andriushchenko, Namhoon Lee | AAML | 34 / 1 / 0 | 29 Nov 2023 |
| PLASTIC: Improving Input and Label Plasticity for Sample Efficient Reinforcement Learning | Hojoon Lee, Hanseul Cho, Hyunseung Kim, Daehoon Gwak, Joonkee Kim, Jaegul Choo, Se-Young Yun, Chulhee Yun | OffRL | 56 / 25 / 0 | 19 Jun 2023 |
| The Crucial Role of Normalization in Sharpness-Aware Minimization | Yan Dai, Kwangjun Ahn, S. Sra | | 8 / 17 / 0 | 24 May 2023 |
| An Adaptive Policy to Employ Sharpness-Aware Minimization | Weisen Jiang, Hansi Yang, Yu Zhang, James T. Kwok | AAML | 69 / 31 / 0 | 28 Apr 2023 |
| AdaSAM: Boosting Sharpness-Aware Minimization with Adaptive Learning Rate and Momentum for Training Deep Neural Networks | Hao Sun, Li Shen, Qihuang Zhong, Liang Ding, Shi-Yong Chen, Jingwei Sun, Jing Li, Guangzhong Sun, Dacheng Tao | | 35 / 31 / 0 | 01 Mar 2023 |
| The Dynamics of Sharpness-Aware Minimization: Bouncing Across Ravines and Drifting Towards Wide Minima | Peter L. Bartlett, Philip M. Long, Olivier Bousquet | | 52 / 34 / 0 | 04 Oct 2022 |
| Sharpness-Aware Minimization Improves Language Model Generalization | Dara Bahri, H. Mobahi, Yi Tay | | 110 / 82 / 0 | 16 Oct 2021 |
| Efficient Sharpness-aware Minimization for Improved Training of Neural Networks | Jiawei Du, Hanshu Yan, Jiashi Feng, Joey Tianyi Zhou, Liangli Zhen, Rick Siow Mong Goh, Vincent Y. F. Tan | AAML | 99 / 132 / 0 | 07 Oct 2021 |
| On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima | N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang | ODL | 273 / 2,696 / 0 | 15 Sep 2016 |