Gathering and Exploiting Higher-Order Information when Training Large Structured Models
6 December 2023
Pierre Wolinski
    ODL

Papers citing "Gathering and Exploiting Higher-Order Information when Training Large Structured Models"

11 / 11 papers shown
Title
Sharpness-Aware Minimization for Efficiently Improving Generalization
International Conference on Learning Representations (ICLR), 2020
Pierre Foret
Ariel Kleiner
H. Mobahi
Behnam Neyshabur
AAML
526
1,600
0
03 Oct 2020
Similarity of Neural Network Representations Revisited
International Conference on Machine Learning (ICML), 2019
Simon Kornblith
Mohammad Norouzi
Honglak Lee
Geoffrey E. Hinton
851
1,679
0
01 May 2019
Are All Layers Created Equal?
Chiyuan Zhang
Samy Bengio
Y. Singer
206
155
0
06 Feb 2019
Neural Tangent Kernel: Convergence and Generalization in Neural Networks
Arthur Jacot
Franck Gabriel
Clément Hongler
920
3,552
0
20 Jun 2018
Block Mean Approximation for Efficient Second Order Optimization
Yao Lu
Mehrtash Harandi
Leonid Sigal
Razvan Pascanu
ODL
89
4
0
16 Apr 2018
Empirical Analysis of the Hessian of Over-Parametrized Neural Networks
International Conference on Learning Representations (ICLR), 2017
Levent Sagun
Utku Evci
V. U. Güney
Yann N. Dauphin
Léon Bottou
231
435
0
14 Jun 2017
Sharp Minima Can Generalize For Deep Nets
Laurent Dinh
Razvan Pascanu
Samy Bengio
Yoshua Bengio
ODL
297
818
0
15 Mar 2017
Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs)
Djork-Arné Clevert
Thomas Unterthiner
Sepp Hochreiter
459
5,828
0
23 Nov 2015
Optimizing Neural Networks with Kronecker-factored Approximate Curvature
James Martens
Roger C. Grosse
ODL
599
1,125
0
19 Mar 2015
Very Deep Convolutional Networks for Large-Scale Image Recognition
International Conference on Learning Representations (ICLR), 2014
Karen Simonyan
Andrew Zisserman
FAtt MDE
3.0K
106,699
0
04 Sep 2014
Riemannian metrics for neural networks I: feedforward networks
Yann Ollivier
274
104
0
04 Mar 2013