Understanding the Generalization Benefit of Normalization Layers:
Sharpness Reduction

Understanding the Generalization Benefit of Normalization Layers: Sharpness Reduction

14 June 2022

Papers citing "Understanding the Generalization Benefit of Normalization Layers: Sharpness Reduction"

14 / 14 papers shown

Title
Novel Concept-Oriented Synthetic Data approach for Training Generative AI-Driven Crystal Grain Analysis Using Diffusion Model Ahmed Sobhi Saleh Kristof Croes Hajdin Ceric Ingrid De Wolf Houman Zahedmanesh DiffM 24 0 0 21 Apr 2025
Minimax Optimal Convergence of Gradient Descent in Logistic Regression via Large and Adaptive Stepsizes Ruiqi Zhang Jingfeng Wu Licong Lin Peter L. Bartlett 20 0 0 05 Apr 2025
Sharpness-Aware Minimization Efficiently Selects Flatter Minima Late in Training Zhanpeng Zhou Mingze Wang Yuchen Mao Bingrui Li Junchi Yan AAML 57 0 0 14 Oct 2024
Does SGD really happen in tiny subspaces? Minhak Song Kwangjun Ahn Chulhee Yun 56 4 1 25 May 2024
Sharpness Minimization Algorithms Do Not Only Minimize Sharpness To Achieve Better Generalization Kaiyue Wen Zhiyuan Li Tengyu Ma FAtt 22 26 0 20 Jul 2023
Gradient Descent Monotonically Decreases the Sharpness of Gradient Flow Solutions in Scalar Networks and Beyond Itai Kreisler Mor Shpigel Nacson Daniel Soudry Y. Carmon 23 13 0 22 May 2023
Implicit Bias of Gradient Descent for Logistic Regression at the Edge of Stability Jingfeng Wu Vladimir Braverman Jason D. Lee 24 16 0 19 May 2023
The Geometry of Neural Nets' Parameter Spaces Under Reparametrization Agustinus Kristiadi Felix Dangel Philipp Hennig 17 10 0 14 Feb 2023
Understanding Incremental Learning of Gradient Descent: A Fine-grained Analysis of Matrix Sensing Jikai Jin Zhiyuan Li Kaifeng Lyu S. Du Jason D. Lee MLT 31 34 0 27 Jan 2023
Learning threshold neurons via the "edge of stability" Kwangjun Ahn Sébastien Bubeck Sinho Chewi Y. Lee Felipe Suarez Yi Zhang MLT 31 36 0 14 Dec 2022
How Does Sharpness-Aware Minimization Minimize Sharpness? Kaiyue Wen Tengyu Ma Zhiyuan Li AAML 21 47 0 10 Nov 2022
Understanding Edge-of-Stability Training Dynamics with a Minimalist Example Xingyu Zhu Zixuan Wang Xiang Wang Mo Zhou Rong Ge 64 35 0 07 Oct 2022
On the Implicit Bias in Deep-Learning Algorithms Gal Vardi FedML AI4CE 25 72 0 26 Aug 2022
Analyzing Sharpness along GD Trajectory: Progressive Sharpening and Edge of Stability Z. Li Zixuan Wang Jian Li 11 42 0 26 Jul 2022