Gathering and Exploiting Higher-Order Information when Training Large Structured Models

6 December 2023

Pierre Wolinski

Papers citing "Gathering and Exploiting Higher-Order Information when Training Large Structured Models"

11 / 11 papers shown

Title
Sharpness-Aware Minimization for Efficiently Improving Generalization | International Conference on Learning Representations (ICLR), 2020 | Pierre Foret, Ariel Kleiner, H. Mobahi, Behnam Neyshabur | AAML | 526 1,600 0 | 03 Oct 2020
Similarity of Neural Network Representations Revisited | International Conference on Machine Learning (ICML), 2019 | Simon Kornblith, Mohammad Norouzi, Honglak Lee, Geoffrey E. Hinton | 851 1,679 0 | 01 May 2019
Are All Layers Created Equal? | Chiyuan Zhang, Samy Bengio, Y. Singer | 206 155 0 | 06 Feb 2019
Neural Tangent Kernel: Convergence and Generalization in Neural Networks | Arthur Jacot, Franck Gabriel, Clément Hongler | 920 3,552 0 | 20 Jun 2018
Block Mean Approximation for Efficient Second Order Optimization | Yao Lu, Mehrtash Harandi, Leonid Sigal, Razvan Pascanu | ODL | 89 4 0 | 16 Apr 2018
Empirical Analysis of the Hessian of Over-Parametrized Neural Networks | International Conference on Learning Representations (ICLR), 2017 | Levent Sagun, Utku Evci, V. U. Güney, Yann N. Dauphin, Léon Bottou | 231 435 0 | 14 Jun 2017
Sharp Minima Can Generalize For Deep Nets | Laurent Dinh, Razvan Pascanu, Samy Bengio, Yoshua Bengio | ODL | 297 818 0 | 15 Mar 2017
Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs) | Djork-Arné Clevert, Thomas Unterthiner, Sepp Hochreiter | 459 5,828 0 | 23 Nov 2015
Optimizing Neural Networks with Kronecker-factored Approximate Curvature | James Martens, Roger C. Grosse | ODL | 599 1,125 0 | 19 Mar 2015
Very Deep Convolutional Networks for Large-Scale Image Recognition | International Conference on Learning Representations (ICLR), 2014 | Karen Simonyan, Andrew Zisserman | FAtt, MDE | 3.0K 106,699 0 | 04 Sep 2014
Riemannian metrics for neural networks I: feedforward networks | Yann Ollivier | 274 104 0 | 04 Mar 2013