The Normalization Method for Alleviating Pathological Sharpness in Wide Neural Networks

Ryo Karakida, S. Akaho, S. Amari
7 June 2019
arXiv:1906.02926

Papers citing "The Normalization Method for Alleviating Pathological Sharpness in Wide Neural Networks"

13 / 13 papers shown

1. Non-identifiability distinguishes Neural Networks among Parametric Models
   Sourav Chatterjee, Timothy Sudijono (25 Apr 2025)
2. Component-Wise Natural Gradient Descent -- An Efficient Neural Network Optimization
   Tran van Sang, Mhd Irvan, R. Yamaguchi, Toshiyuki Nakata (11 Oct 2022)
3. Analyzing Sharpness along GD Trajectory: Progressive Sharpening and Edge of Stability
   Z. Li, Zixuan Wang, Jian Li (26 Jul 2022)
4. Understanding the Generalization Benefit of Normalization Layers: Sharpness Reduction
   Kaifeng Lyu, Zhiyuan Li, Sanjeev Arora (14 Jun 2022)
5. Asymptotic Freeness of Layerwise Jacobians Caused by Invariance of Multilayer Perceptron: The Haar Orthogonal Case
   B. Collins, Tomohiro Hayase (24 Mar 2021)
6. ASAM: Adaptive Sharpness-Aware Minimization for Scale-Invariant Learning of Deep Neural Networks
   Jungmin Kwon, Jeongseop Kim, Hyunseong Park, I. Choi (23 Feb 2021)
7. Group Whitening: Balancing Learning Efficiency and Representational Capacity
   Lei Huang, Yi Zhou, Li Liu, Fan Zhu, Ling Shao (28 Sep 2020)
8. When Does Preconditioning Help or Hurt Generalization?
   S. Amari, Jimmy Ba, Roger C. Grosse, Xuechen Li, Atsushi Nitanda, Taiji Suzuki, Denny Wu, Ji Xu (18 Jun 2020)
9. The Spectrum of Fisher Information of Deep Networks Achieving Dynamical Isometry
   Tomohiro Hayase, Ryo Karakida (14 Jun 2020)
10. Any Target Function Exists in a Neighborhood of Any Sufficiently Wide Random Network: A Geometrical Perspective
    S. Amari (20 Jan 2020)
11. Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks
    Lechao Xiao, Yasaman Bahri, Jascha Narain Sohl-Dickstein, S. Schoenholz, Jeffrey Pennington (14 Jun 2018)
12. Universal Statistics of Fisher Information in Deep Neural Networks: Mean Field Approach
    Ryo Karakida, S. Akaho, S. Amari (04 Jun 2018)
13. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
    N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang (15 Sep 2016)