On Linear Stability of SGD and Input-Smoothness of Neural Networks

27 May 2021

Chao Ma

Papers citing "On Linear Stability of SGD and Input-Smoothness of Neural Networks"

| Title | Authors | Tags | | | | Date |
|---|---|---|---|---|---|---|
| Reasoning Bias of Next Token Prediction Training | Pengxiao Lin, Zhongwang Zhang, Zhi-Qin John Xu | LRM | 88 | 1 | 0 | 21 Feb 2025 |
| Sharpness-Aware Minimization Efficiently Selects Flatter Minima Late in Training | Zhanpeng Zhou, Mingze Wang, Yuchen Mao, Bingrui Li, Junchi Yan | AAML | 62 | 0 | 0 | 14 Oct 2024 |
| High dimensional analysis reveals conservative sharpening and a stochastic edge of stability | Atish Agarwala, Jeffrey Pennington | | 41 | 3 | 0 | 30 Apr 2024 |
| Sharpness Minimization Algorithms Do Not Only Minimize Sharpness To Achieve Better Generalization | Kaiyue Wen, Zhiyuan Li, Tengyu Ma | FAtt | 36 | 26 | 0 | 20 Jul 2023 |
| How to escape sharp minima with random perturbations | Kwangjun Ahn, Ali Jadbabaie, S. Sra | ODL | 29 | 6 | 0 | 25 May 2023 |
| mSAM: Micro-Batch-Averaged Sharpness-Aware Minimization | Kayhan Behdin, Qingquan Song, Aman Gupta, S. Keerthi, Ayan Acharya, Borja Ocejo, Gregory Dexter, Rajiv Khanna, D. Durfee, Rahul Mazumder | AAML | 15 | 7 | 0 | 19 Feb 2023 |
| On the Lipschitz Constant of Deep Networks and Double Descent | Matteo Gamba, Hossein Azizpour, Marten Bjorkman | | 21 | 7 | 0 | 28 Jan 2023 |
| Deep Double Descent via Smooth Interpolation | Matteo Gamba, Erik Englesson, Marten Bjorkman, Hossein Azizpour | | 59 | 10 | 0 | 21 Sep 2022 |
| On the Implicit Bias in Deep-Learning Algorithms | Gal Vardi | FedML, AI4CE | 32 | 72 | 0 | 26 Aug 2022 |
| A Unified Analysis of Mixed Sample Data Augmentation: A Loss Function Perspective | Chanwoo Park, Sangdoo Yun, Sanghyuk Chun | AAML | 18 | 32 | 0 | 21 Aug 2022 |
| Understanding the Generalization Benefit of Normalization Layers: Sharpness Reduction | Kaifeng Lyu, Zhiyuan Li, Sanjeev Arora | FAtt | 37 | 69 | 0 | 14 Jun 2022 |
| Beyond the Quadratic Approximation: the Multiscale Structure of Neural Network Loss Landscapes | Chao Ma, D. Kunin, Lei Wu, Lexing Ying | | 25 | 27 | 0 | 24 Apr 2022 |
| On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima | N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang | ODL | 281 | 2,888 | 0 | 15 Sep 2016 |