The Full Spectrum of Deepnet Hessians at Scale: Dynamics with SGD Training and Sample Size
Versions: v1, v2 (latest)

16 November 2018
Vardan Papyan

Papers citing "The Full Spectrum of Deepnet Hessians at Scale: Dynamics with SGD Training and Sample Size"

25 / 25 papers shown
The Persistence of Neural Collapse Despite Low-Rank Bias: An Analytic Perspective Through Unconstrained Features
Connall Garrod
Jonathan P. Keating
65
4
0
30 Oct 2024
Exact Gauss-Newton Optimization for Training Deep Neural Networks
Mikalai Korbit
Adeyemi Damilare Adeoye
Alberto Bemporad
Mario Zanon
ODL
58
1
0
23 May 2024
Unifying Low Dimensional Observations in Deep Learning Through the Deep Linear Unconstrained Feature Model
Connall Garrod
Jonathan P. Keating
109
9
0
09 Apr 2024
Symmetric Neural-Collapse Representations with Supervised Contrastive Loss: The Impact of ReLU and Batching
Ganesh Ramachandra Kini
V. Vakilian
Tina Behnia
Jaidev Gill
Christos Thrampoulidis
72
2
0
13 Jun 2023
Complexity from Adaptive-Symmetries Breaking: Global Minima in the Statistical Mechanics of Deep Neural Networks
Shaun Li
AI4CE
75
0
0
03 Jan 2022
MIO : Mutual Information Optimization using Self-Supervised Binary Contrastive Learning
Siladittya Manna
Umapada Pal
Saumik Bhattacharya
SSL
123
1
0
24 Nov 2021
Recent advances in deep learning theory
Fengxiang He
Dacheng Tao
AI4CE
130
51
0
20 Dec 2020
A Deeper Look at the Hessian Eigenspectrum of Deep Neural Networks and its Applications to Regularization
Adepu Ravi Sankar
Yash Khasbage
Rahul Vigneswaran
V. Balasubramanian
89
44
0
07 Dec 2020
Traces of Class/Cross-Class Structure Pervade Deep Learning Spectra
Vardan Papyan
64
80
0
27 Aug 2020
Prevalence of Neural Collapse during the terminal phase of deep learning training
Vardan Papyan
Xuemei Han
D. Donoho
252
582
0
18 Aug 2020
Exploring Weight Importance and Hessian Bias in Model Pruning
Mingchen Li
Yahya Sattar
Christos Thrampoulidis
Samet Oymak
71
4
0
19 Jun 2020
Learning Rates as a Function of Batch Size: A Random Matrix Theory Approach to Neural Network Training
Diego Granziol
S. Zohren
Stephen J. Roberts
ODL
148
50
0
16 Jun 2020
Layer-wise Conditioning Analysis in Exploring the Learning Dynamics of DNNs
Lei Huang
Jie Qin
Li Liu
Fan Zhu
Ling Shao
AI4CE
86
11
0
25 Feb 2020
The Geometry of Sign Gradient Descent
Lukas Balles
Fabian Pedregosa
Nicolas Le Roux
ODL
88
27
0
19 Feb 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
668
4,937
0
23 Jan 2020
Deep Curvature Suite
Diego Granziol
Xingchen Wan
T. Garipov
3DV
50
12
0
20 Dec 2019
On the Heavy-Tailed Theory of Stochastic Gradient Descent for Deep Neural Networks
Umut Simsekli
Mert Gurbuzbalaban
T. H. Nguyen
G. Richard
Levent Sagun
88
59
0
29 Nov 2019
Geometry of learning neural quantum states
Chae-Yeun Park
M. Kastoryano
72
63
0
24 Oct 2019
GradVis: Visualization and Second Order Analysis of Optimization Surfaces during the Training of Deep Neural Networks
Avraam Chatzimichailidis
Franz-Josef Pfreundt
N. Gauger
J. Keuper
52
10
0
26 Sep 2019
Hessian based analysis of SGD for Deep Nets: Dynamics and Generalization
Xinyan Li
Qilong Gu
Yingxue Zhou
Tiancong Chen
A. Banerjee
ODL
88
52
0
24 Jul 2019
First Exit Time Analysis of Stochastic Gradient Descent Under Heavy-Tailed Gradient Noise
T. H. Nguyen
Umut Simsekli
Mert Gurbuzbalaban
G. Richard
79
65
0
21 Jun 2019
Negative eigenvalues of the Hessian in deep neural networks
Guillaume Alain
Nicolas Le Roux
Pierre-Antoine Manzagol
71
44
0
06 Feb 2019
An Investigation into Neural Net Optimization via Hessian Eigenvalue Density
Behrooz Ghorbani
Shankar Krishnan
Ying Xiao
ODL
113
326
0
29 Jan 2019
Ambitious Data Science Can Be Painless
Hatef Monajemi
Riccardo Murri
Eric Jonas
Percy Liang
V. Stodden
D. Donoho
135
13
0
25 Jan 2019
Measurements of Three-Level Hierarchical Structure in the Outliers in the Spectrum of Deepnet Hessians
Vardan Papyan
82
88
0
24 Jan 2019