Dissecting Hessian: Understanding Common Structure of Hessian in Neural Networks

arXiv:2010.04261 · 8 October 2020
Yikai Wu, Xingyu Zhu, Chenwei Wu, Annie Wang, Rong Ge

Papers citing "Dissecting Hessian: Understanding Common Structure of Hessian in Neural Networks" (11 of 11 papers shown)

Theoretical characterisation of the Gauss-Newton conditioning in Neural Networks
Jim Zhao, Sidak Pal Singh, Aurelien Lucchi · AI4CE · 04 Nov 2024

Unraveling the Hessian: A Key to Smooth Convergence in Loss Function Landscapes
Nikita Kiselev, Andrey Grabovoy · 18 Sep 2024

Loss Gradient Gaussian Width based Generalization and Optimization Guarantees
A. Banerjee, Qiaobo Li, Yingxue Zhou · 11 Jun 2024

Achieving Dimension-Free Communication in Federated Learning via Zeroth-Order Optimization
Zhe Li, Bicheng Ying, Zidong Liu, Haibo Yang · FedML · 24 May 2024

Q-Newton: Hybrid Quantum-Classical Scheduling for Accelerating Neural Network Training with Newton's Gradient Descent
Pingzhi Li, Junyu Liu, Hanrui Wang, Tianlong Chen · 30 Apr 2024

Analyzing Sharpness along GD Trajectory: Progressive Sharpening and Edge of Stability
Z. Li, Zixuan Wang, Jian Li · 26 Jul 2022

Federated Optimization of Smooth Loss Functions
Ali Jadbabaie, A. Makur, Devavrat Shah · FedML · 06 Jan 2022

Does the Data Induce Capacity Control in Deep Learning?
Rubing Yang, J. Mao, Pratik Chaudhari · 27 Oct 2021

On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang · ODL · 15 Sep 2016

A short note on the tail bound of Wishart distribution
Shenghuo Zhu · 24 Dec 2012

PAC-Bayesian Supervised Classification: The Thermodynamics of Statistical Learning
O. Catoni · 03 Dec 2007