2012.13841
Cited By
Understanding Decoupled and Early Weight Decay
27 December 2020
Johan Bjorck
Kilian Q. Weinberger
Carla P. Gomes
Papers citing
"Understanding Decoupled and Early Weight Decay"
5 / 5 papers shown
Title
Scaling Optimal LR Across Token Horizons
Johan Bjorck
Alon Benhaim
Vishrav Chaudhary
Furu Wei
Xia Song
54
4
0
30 Sep 2024
VoP: Text-Video Co-operative Prompt Tuning for Cross-Modal Retrieval
Siteng Huang
Biao Gong
Yulin Pan
Jianwen Jiang
Yiliang Lv
Yuyuan Li
Donglin Wang
VLM
VPVLM
22
41
0
23 Nov 2022
Cyclical Focal Loss
L. Smith
30
14
0
16 Feb 2022
Understanding AdamW through Proximal Methods and Scale-Freeness
Zhenxun Zhuang
Mingrui Liu
Ashok Cutkosky
Francesco Orabona
37
63
0
31 Jan 2022
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar
Dheevatsa Mudigere
J. Nocedal
M. Smelyanskiy
P. T. P. Tang
ODL
281
2,889
0
15 Sep 2016