Block-diagonal Hessian-free Optimization for Training Neural Networks

20 December 2017
Huishuai Zhang, Caiming Xiong, James Bradbury, Richard Socher
ODL
arXiv: 1712.07296 (abs · PDF · HTML)

Papers citing "Block-diagonal Hessian-free Optimization for Training Neural Networks"

8 / 8 papers shown
Understanding Why Adam Outperforms SGD: Gradient Heterogeneity in Transformers
Akiyoshi Tomihari, Issei Sato
ODL
31 Jan 2025

Debiasing Mini-Batch Quadratics for Applications in Deep Learning
Lukas Tatzel, Bálint Mucsányi, Osane Hackel, Philipp Hennig
18 Oct 2024

Batch Normalization Preconditioning for Neural Network Training
Susanna Lange, Kyle E. Helfrich, Qiang Ye
02 Aug 2021

ViViT: Curvature access through the generalized Gauss-Newton's low-rank structure
Felix Dangel, Lukas Tatzel, Philipp Hennig
04 Jun 2021

Whitening and second order optimization both make information in the dataset unusable during training, and can reduce or prevent generalization
Neha S. Wadia, Daniel Duckworth, Samuel S. Schoenholz, Ethan Dyer, Jascha Narain Sohl-Dickstein
17 Aug 2020

DeepOBS: A Deep Learning Optimizer Benchmark Suite
Frank Schneider, Lukas Balles, Philipp Hennig
ODL
13 Mar 2019

Small steps and giant leaps: Minimal Newton solvers for Deep Learning
João F. Henriques, Sébastien Ehrhardt, Samuel Albanie, Andrea Vedaldi
ODL
21 May 2018

EA-CG: An Approximate Second-Order Method for Training Fully-Connected Neural Networks
Sheng-Wei Chen, Chun-Nan Chou, Edward Y. Chang
19 Feb 2018