A Deeper Look at the Hessian Eigenspectrum of Deep Neural Networks and its Applications to Regularization

7 December 2020

Papers citing "A Deeper Look at the Hessian Eigenspectrum of Deep Neural Networks and its Applications to Regularization"

7 / 7 papers shown

Title
A Hessian-informed hyperparameter optimization for differential learning rate Shiyun Xu Zhiqi Bu Yiliang Zhang Ian J. Barnett 39 1 0 12 Jan 2025
Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training Hong Liu Zhiyuan Li David Leo Wright Hall Percy Liang Tengyu Ma VLM 27 128 0 23 May 2023
Sketchy: Memory-efficient Adaptive Regularization with Frequent Directions Vladimir Feinberg Xinyi Chen Y. Jennifer Sun Rohan Anil Elad Hazan 23 12 0 07 Feb 2023
Generalisation under gradient descent via deterministic PAC-Bayes Eugenio Clerico Tyler Farghly George Deligiannidis Benjamin Guedj Arnaud Doucet 26 4 0 06 Sep 2022
When Do Flat Minima Optimizers Work? Jean Kaddour Linqing Liu Ricardo M. A. Silva Matt J. Kusner ODL 11 58 0 01 Feb 2022
On the Power-Law Hessian Spectrums in Deep Learning Zeke Xie Qian-Yuan Tang Yunfeng Cai Mingming Sun P. Li ODL 42 8 0 31 Jan 2022
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima N. Keskar Dheevatsa Mudigere J. Nocedal M. Smelyanskiy P. T. P. Tang ODL 281 2,888 0 15 Sep 2016