2006.00719
Cited By
ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning
1 June 2020
Z. Yao
A. Gholami
Sheng Shen
Mustafa Mustafa
Kurt Keutzer
Michael W. Mahoney
ODL
Papers citing
"ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning"
50 / 150 papers shown
Title
Second-Order Fine-Tuning without Pain for LLMs: A Hessian Informed Zeroth-Order Optimizer
Yanjun Zhao
Sizhe Dang
Haishan Ye
Guang Dai
Yi Qian
Ivor W. Tsang
66
8
0
23 Feb 2024
Beyond Uniform Scaling: Exploring Depth Heterogeneity in Neural Architectures
Akash Guna R.T
Arnav Chavan
Deepak Gupta
MDE
32
0
0
19 Feb 2024
Mini-Hes: A Parallelizable Second-order Latent Factor Analysis Model
Jialiang Wang
Weiling Li
Yurong Zhong
Xin Luo
27
0
0
19 Feb 2024
Stochastic Hessian Fittings with Lie Groups
Xi-Lin Li
43
1
0
19 Feb 2024
Preconditioners for the Stochastic Training of Implicit Neural Representations
Shin-Fang Chng
Hemanth Saratchandran
Simon Lucey
26
0
0
13 Feb 2024
Tradeoffs of Diagonal Fisher Information Matrix Estimators
Alexander Soen
Ke Sun
24
1
0
08 Feb 2024
Curvature-Informed SGD via General Purpose Lie-Group Preconditioners
Omead Brandon Pooladzandi
Xi-Lin Li
38
4
0
07 Feb 2024
SANIA: Polyak-type Optimization Framework Leads to Scale Invariant Stochastic Algorithms
Farshed Abdukhakimov
Chulu Xiang
Dmitry Kamzolov
Robert Mansel Gower
Martin Takáč
43
2
0
28 Dec 2023
AGD: an Auto-switchable Optimizer using Stepwise Gradient Difference for Preconditioning Matrix
Yun Yue
Zhiling Ye
Jiadi Jiang
Yongchao Liu
Ke Zhang
ODL
26
1
0
04 Dec 2023
Temperature Balancing, Layer-wise Weight Analysis, and Neural Network Training
Yefan Zhou
Tianyu Pang
Keqin Liu
Charles H. Martin
Michael W. Mahoney
Yaoqing Yang
42
7
0
01 Dec 2023
Data-efficient operator learning for solving high Mach number fluid flow problems
Noah Ford
Victor J. Leon
Honest Mrema
Jeffrey Gilbert
Alexander New
AI4CE
24
0
0
28 Nov 2023
Signal Processing Meets SGD: From Momentum to Filter
Zhipeng Yao
Guisong Chang
Jiaqi Zhang
Qi Zhang
Dazhou Li
Yu Zhang
ODL
37
0
0
06 Nov 2023
Kronecker-Factored Approximate Curvature for Modern Neural Network Architectures
Runa Eschenhagen
Alexander Immer
Richard Turner
Frank Schneider
Philipp Hennig
61
21
0
01 Nov 2023
Information-Theoretic Trust Regions for Stochastic Gradient-Based Optimization
Philipp Dahlinger
P. Becker
Maximilian Hüttenrauch
Gerhard Neumann
15
0
0
31 Oct 2023
AdaSub: Stochastic Optimization Using Second-Order Information in Low-Dimensional Subspaces
João Victor Galvão da Mata
Martin S. Andersen
13
1
0
30 Oct 2023
Studying K-FAC Heuristics by Viewing Adam through a Second-Order Lens
Ross M. Clarke
José Miguel Hernández-Lobato
46
2
0
23 Oct 2023
Towards Hyperparameter-Agnostic DNN Training via Dynamical System Insights
Carmel Fiscko
Aayushya Agarwal
Yihan Ruan
S. Kar
L. Pileggi
Bruno Sinopoli
20
0
0
21 Oct 2023
Stochastic Gradient Descent with Preconditioned Polyak Step-size
Farshed Abdukhakimov
Chulu Xiang
Dmitry Kamzolov
Martin Takáč
31
5
0
03 Oct 2023
Eva: A General Vectorized Approximation Framework for Second-order Optimization
Lin Zhang
S. Shi
Bo-wen Li
28
1
0
04 Aug 2023
Flatness-Aware Minimization for Domain Generalization
Xingxuan Zhang
Renzhe Xu
Han Yu
Yancheng Dong
Pengfei Tian
Peng Cui
32
20
0
20 Jul 2023
Bidirectional Looking with A Novel Double Exponential Moving Average to Adaptive and Non-adaptive Momentum Optimizers
Yineng Chen
Z. Li
Lefei Zhang
Bo Du
Hai Zhao
33
4
0
02 Jul 2023
G-TRACER: Expected Sharpness Optimization
John R. Williams
Stephen J. Roberts
35
0
0
24 Jun 2023
FlakyFix: Using Large Language Models for Predicting Flaky Test Fix Categories and Test Code Repair
Sakina Fatima
Hadi Hemmati
Lionel C. Briand
34
4
0
21 Jun 2023
Understanding Optimization of Deep Learning via Jacobian Matrix and Lipschitz Constant
Xianbiao Qi
Jianan Wang
Lei Zhang
18
0
0
15 Jun 2023
Error Feedback Can Accurately Compress Preconditioners
Ionut-Vlad Modoranu
A. Kalinov
Eldar Kurtic
Elias Frantar
Dan Alistarh
ODL
16
4
0
09 Jun 2023
Searching for Optimal Per-Coordinate Step-sizes with Multidimensional Backtracking
Frederik Kunstner
V. S. Portella
Mark W. Schmidt
Nick Harvey
28
8
0
05 Jun 2023
Towards Sustainable Learning: Coresets for Data-efficient Deep Learning
Yu Yang
Hao Kang
Baharan Mirzasoleiman
38
34
0
02 Jun 2023
KrADagrad: Kronecker Approximation-Domination Gradient Preconditioned Stochastic Optimization
Jonathan Mei
Alexander Moreno
Luke Walters
ODL
29
1
0
30 May 2023
Minibatching Offers Improved Generalization Performance for Second Order Optimizers
Eric Silk
Swarnita Chakraborty
N. Dasgupta
Anand D. Sarwate
A. Lumsdaine
Tony Chiang
ODL
13
0
0
26 May 2023
SING: A Plug-and-Play DNN Learning Technique
Adrien Courtois
Damien Scieur
Jean-Michel Morel
Pablo Arias
Thomas Eboli
36
0
0
25 May 2023
Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training
Hong Liu
Zhiyuan Li
David Leo Wright Hall
Percy Liang
Tengyu Ma
VLM
55
132
0
23 May 2023
Layer-wise Adaptive Step-Sizes for Stochastic First-Order Methods for Deep Learning
Achraf Bahamou
D. Goldfarb
ODL
36
0
0
23 May 2023
ASDL: A Unified Interface for Gradient Preconditioning in PyTorch
Kazuki Osawa
Satoki Ishikawa
Rio Yokota
Shigang Li
Torsten Hoefler
ODL
38
14
0
08 May 2023
ISAAC Newton: Input-based Approximate Curvature for Newton's Method
Felix Petersen
Tobias Sutter
Christian Borgelt
Dongsung Huh
Hilde Kuehne
Yuekai Sun
Oliver Deussen
ODL
31
5
0
01 May 2023
Loss-Curvature Matching for Dataset Selection and Condensation
Seung-Jae Shin
Heesun Bae
DongHyeok Shin
Weonyoung Joo
Il-Chul Moon
DD
49
24
0
08 Mar 2023
FOSI: Hybrid First and Second Order Optimization
Hadar Sivan
Moshe Gabel
Assaf Schuster
ODL
34
2
0
16 Feb 2023
Gradient Shaping: Enhancing Backdoor Attack Against Reverse Engineering
Rui Zhu
Di Tang
Siyuan Tang
Guanhong Tao
Shiqing Ma
Xiaofeng Wang
Haixu Tang
DD
23
3
0
29 Jan 2023
Projective Integral Updates for High-Dimensional Variational Inference
J. Duersch
35
1
0
20 Jan 2023
Task Weighting in Meta-learning with Trajectory Optimisation
Cuong C. Nguyen
Thanh-Toan Do
G. Carneiro
31
3
0
04 Jan 2023
CODEBench: A Neural Architecture and Hardware Accelerator Co-Design Framework
Shikhar Tuli
Chia-Hao Li
Ritvik Sharma
N. Jha
36
13
0
07 Dec 2022
A survey of deep learning optimizers -- first and second order methods
Rohan Kashyap
ODL
37
6
0
28 Nov 2022
On the Effectiveness of Parameter-Efficient Fine-Tuning
Z. Fu
Haoran Yang
Anthony Man-Cho So
Wai Lam
Lidong Bing
Nigel Collier
27
156
0
28 Nov 2022
Black Box Lie Group Preconditioners for SGD
Xi-Lin Li
13
8
0
08 Nov 2022
Adaptive scaling of the learning rate by second order automatic differentiation
F. Gournay
Alban Gossard
ODL
31
1
0
26 Oct 2022
An Efficient Nonlinear Acceleration method that Exploits Symmetry of the Hessian
Huan He
Shifan Zhao
Z. Tang
Joyce C. Ho
Y. Saad
Yuanzhe Xi
32
3
0
22 Oct 2022
HesScale: Scalable Computation of Hessian Diagonals
Mohamed Elsayed
A. R. Mahmood
22
7
0
20 Oct 2022
Tunable Complexity Benchmarks for Evaluating Physics-Informed Neural Networks on Coupled Ordinary Differential Equations
Alexander New
B. Eng
A. Timm
A. Gearhart
20
4
0
14 Oct 2022
Exploring Contextual Representation and Multi-Modality for End-to-End Autonomous Driving
Shoaib Azam
Farzeen Munir
Ville Kyrki
M. Jeon
Witold Pedrycz
56
1
0
13 Oct 2022
Compute-Efficient Deep Learning: Algorithmic Trends and Opportunities
Brian Bartoldson
B. Kailkhura
Davis W. Blalock
31
47
0
13 Oct 2022
Learning to Optimize Quasi-Newton Methods
Isaac Liao
Rumen Dangovski
Jakob N. Foerster
Marin Soljacic
38
4
0
11 Oct 2022