2107.01739
KAISA: An Adaptive Second-Order Optimizer Framework for Deep Neural Networks
4 July 2021
J. G. Pauloski
Qi Huang
Lei Huang
Shivaram Venkataraman
Kyle Chard
Ian Foster
Zhao-jie Zhang
Papers citing "KAISA: An Adaptive Second-Order Optimizer Framework for Deep Neural Networks"
21 / 21 papers shown
Purifying Shampoo: Investigating Shampoo's Heuristics by Decomposing its Preconditioner
Runa Eschenhagen
Aaron Defazio
Tsung-Hsien Lee
Richard Turner
Hao-Jun Michael Shi
73
0
0
04 Jun 2025
Accelerating Deep Neural Network Training via Distributed Hybrid Order Optimization
Shunxian Gu
Chaoqun You
Bangbang Ren
Lailong Luo
Junxu Xia
Deke Guo
70
0
0
02 May 2025
SLoPe: Double-Pruned Sparse Plus Lazy Low-Rank Adapter Pretraining of LLMs
Mohammad Mozaffari
Amir Yazdanbakhsh
Zhao Zhang
M. Dehnavi
144
7
0
28 Jan 2025
Influence Functions for Scalable Data Attribution in Diffusion Models
Bruno Mlodozeniec
Runa Eschenhagen
Juhan Bae
Alexander Immer
David Krueger
Richard E. Turner
DiffM
TDI
151
7
0
17 Oct 2024
Adversarial Vulnerability as a Consequence of On-Manifold Inseparibility
Rajdeep Haldar
Yue Xing
Qifan Song
Guang Lin
42
0
0
09 Oct 2024
PETScML: Second-order solvers for training regression problems in Scientific Machine Learning
Stefano Zampini
Umberto Zerbinati
George Turkiyyah
David E. Keyes
62
5
0
18 Mar 2024
On the Parameterization of Second-Order Optimization Effective Towards the Infinite Width
Satoki Ishikawa
Ryo Karakida
78
2
0
19 Dec 2023
Merging by Matching Models in Task Parameter Subspaces
Derek Tam
Mohit Bansal
Colin Raffel
MoMe
99
12
0
07 Dec 2023
Eva: A General Vectorized Approximation Framework for Second-order Optimization
Lin Zhang
Shaoshuai Shi
Yue Liu
69
1
0
04 Aug 2023
Mirage: Towards Low-interruption Services on Batch GPU Clusters with Reinforcement Learning
Qi-Dong Ding
Pengfei Zheng
Shreyas Kudari
Shivaram Venkataraman
Zhao-jie Zhang
VLM
OffRL
47
3
0
25 Jun 2023
Fine-grained Policy-driven I/O Sharing for Burst Buffers
E. Karrels
Lei Huang
Yuhong Kan
Ishank Arora
Yinzhi Wang
Daniel S. Katz
W. Gropp
Zhao-jie Zhang
24
3
0
20 Jun 2023
MKOR: Momentum-Enabled Kronecker-Factor-Based Optimizer Using Rank-1 Updates
Mohammad Mozaffari
Sikan Li
Zhao Zhang
M. Dehnavi
63
4
0
02 Jun 2023
Minibatching Offers Improved Generalization Performance for Second Order Optimizers
Eric Silk
Swarnita Chakraborty
N. Dasgupta
Anand D. Sarwate
A. Lumsdaine
Tony Chiang
ODL
21
0
0
26 May 2023
On Efficient Training of Large-Scale Deep Learning Models: A Literature Review
Li Shen
Yan Sun
Zhiyuan Yu
Liang Ding
Xinmei Tian
Dacheng Tao
VLM
105
43
0
07 Apr 2023
FOSI: Hybrid First and Second Order Optimization
Hadar Sivan
Moshe Gabel
Assaf Schuster
ODL
74
2
0
16 Feb 2023
PipeFisher: Efficient Training of Large Language Models Using Pipelining and Fisher Information Matrices
Kazuki Osawa
Shigang Li
Torsten Hoefler
AI4CE
84
26
0
25 Nov 2022
Compute-Efficient Deep Learning: Algorithmic Trends and Opportunities
Brian Bartoldson
B. Kailkhura
Davis W. Blalock
107
50
0
13 Oct 2022
Efficient Quantized Sparse Matrix Operations on Tensor Cores
Shigang Li
Kazuki Osawa
Torsten Hoefler
160
32
0
14 Sep 2022
PoF: Post-Training of Feature Extractor for Improving Generalization
Ikuro Sato
Ryota Yamada
Masayuki Tanaka
Nakamasa Inoue
Rei Kawakami
37
4
0
05 Jul 2022
Scalable K-FAC Training for Deep Neural Networks with Distributed Preconditioning
Lin Zhang
Shaoshuai Shi
Wei Wang
Yue Liu
65
10
0
30 Jun 2022
Scale-invariant Learning by Physics Inversion
Philipp Holl
V. Koltun
Nils Thuerey
PINN
AI4CE
76
9
0
30 Sep 2021