1905.09899
Cited By
Blockwise Adaptivity: Faster Training and Better Generalization in Deep Learning
23 May 2019
Shuai Zheng, James T. Kwok
ODL
Papers citing "Blockwise Adaptivity: Faster Training and Better Generalization in Deep Learning"
4 / 4 papers shown
Title
A Hessian-informed hyperparameter optimization for differential learning rate
Shiyun Xu, Zhiqi Bu, Yiliang Zhang, Ian Barnett
39
1
0
12 Jan 2025
SPAM: Spike-Aware Adam with Momentum Reset for Stable LLM Training
Tianjin Huang, Ziquan Zhu, Gaojie Jin, Lu Liu, Zhangyang Wang, Shiwei Liu
42
1
0
12 Jan 2025
Deconstructing What Makes a Good Optimizer for Language Models
Rosie Zhao, Depen Morwani, David Brandfonbrener, Nikhil Vyas, Sham Kakade
47
17
0
10 Jul 2024
Stochastic Gradient Descent for Non-smooth Optimization: Convergence Results and Optimal Averaging Schemes
Ohad Shamir, Tong Zhang
101
570
0
08 Dec 2012