2010.04261
Dissecting Hessian: Understanding Common Structure of Hessian in Neural Networks
8 October 2020
Yikai Wu
Xingyu Zhu
Chenwei Wu
Annie Wang
Rong Ge
Papers citing "Dissecting Hessian: Understanding Common Structure of Hessian in Neural Networks"
11 / 11 papers shown
Theoretical characterisation of the Gauss-Newton conditioning in Neural Networks
Jim Zhao
Sidak Pal Singh
Aurelien Lucchi
AI4CE
45
0
0
04 Nov 2024
Unraveling the Hessian: A Key to Smooth Convergence in Loss Function Landscapes
Nikita Kiselev
Andrey Grabovoy
54
1
0
18 Sep 2024
Loss Gradient Gaussian Width based Generalization and Optimization Guarantees
A. Banerjee
Qiaobo Li
Yingxue Zhou
49
0
0
11 Jun 2024
Achieving Dimension-Free Communication in Federated Learning via Zeroth-Order Optimization
Zhe Li
Bicheng Ying
Zidong Liu
Haibo Yang
FedML
59
3
0
24 May 2024
Q-Newton: Hybrid Quantum-Classical Scheduling for Accelerating Neural Network Training with Newton's Gradient Descent
Pingzhi Li
Junyu Liu
Hanrui Wang
Tianlong Chen
84
1
0
30 Apr 2024
Analyzing Sharpness along GD Trajectory: Progressive Sharpening and Edge of Stability
Z. Li
Zixuan Wang
Jian Li
19
42
0
26 Jul 2022
Federated Optimization of Smooth Loss Functions
Ali Jadbabaie
A. Makur
Devavrat Shah
FedML
21
7
0
06 Jan 2022
Does the Data Induce Capacity Control in Deep Learning?
Rubing Yang
J. Mao
Pratik Chaudhari
30
15
0
27 Oct 2021
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar
Dheevatsa Mudigere
J. Nocedal
M. Smelyanskiy
P. T. P. Tang
ODL
284
2,889
0
15 Sep 2016
A short note on the tail bound of Wishart distribution
Shenghuo Zhu
80
17
0
24 Dec 2012
Pac-Bayesian Supervised Classification: The Thermodynamics of Statistical Learning
O. Catoni
148
454
0
03 Dec 2007