2302.00195
Cited By
Weight Prediction Boosts the Convergence of AdamW
1 February 2023
Lei Guan
Papers citing "Weight Prediction Boosts the Convergence of AdamW"
7 / 7 papers shown
Title
Optimizing Large Language Models for ESG Activity Detection in Financial Texts
Mattia Birti
Francesco Osborne
Andrea Maurino
44
0
0
28 Feb 2025
One Step Learning, One Step Review
Xiaolong Huang
Qiankun Li
Xueran Li
Xuesong Gao
33
1
0
19 Jan 2024
PipeOptim: Ensuring Effective 1F1B Schedule with Optimizer-Dependent Weight Prediction
Lei Guan
Dongsheng Li
Jiye Liang
Wenjian Wang
Xicheng Lu
37
1
0
01 Dec 2023
AdaPlus: Integrating Nesterov Momentum and Precise Stepsize Adjustment on AdamW Basis
Lei Guan
ODL
34
3
0
05 Sep 2023
XGrad: Boosting Gradient-Based Optimizers With Weight Prediction
Lei Guan
Dongsheng Li
Yanqi Shi
Jian Meng
ODL
41
2
0
26 May 2023
Are Transformers More Robust Than CNNs?
Yutong Bai
Jieru Mei
Alan Yuille
Cihang Xie
ViT
AAML
192
257
0
10 Nov 2021
Densely Connected Convolutional Networks
Gao Huang
Zhuang Liu
Laurens van der Maaten
Kilian Q. Weinberger
PINN
3DV
312
36,381
0
25 Aug 2016