arXiv:2103.12682
How to decay your learning rate
Aitor Lewkowycz, 23 March 2021

Papers citing "How to decay your learning rate" (13 papers):
1. Sapiens: Foundation for Human Vision Models [VLM]
   Rawal Khirodkar, Timur M. Bagautdinov, Julieta Martinez, Su Zhaoen, Austin James, Peter Selednik, Stuart Anderson, Shunsuke Saito (22 Aug 2024)

2. Carrying over algorithm in transformers
   J. Kruthoff (15 Jan 2024)

3. Building a Llama2-finetuned LLM for Odia Language Utilizing Domain Knowledge Instruction Set
   Guneet Singh Kohli, Shantipriya Parida, Sambit Sekhar, Samirit Saha, Nipun B. Nair, Parul Agarwal, Sonal Khosla, Kusumlata Patiyal, Debasish Dhal (19 Dec 2023)

4. Using Stochastic Gradient Descent to Smooth Nonconvex Functions: Analysis of Implicit Graduated Optimization with Optimal Noise Scheduling
   Naoki Sato, Hideaki Iiduka (15 Nov 2023)

5. An Automatic Learning Rate Schedule Algorithm for Achieving Faster Convergence and Steeper Descent
   Zhao-quan Song, Chiwun Yang (17 Oct 2023)

6. Why Do We Need Weight Decay in Modern Deep Learning?
   Maksym Andriushchenko, Francesco D'Angelo, Aditya Varre, Nicolas Flammarion (06 Oct 2023)

7. Time-sensitive Learning for Heterogeneous Federated Edge Intelligence
   Yong Xiao, Xiaohan Zhang, Guangming Shi, Marwan Krunz, Diep N. Nguyen, D. Hoang (26 Jan 2023)

8. Learning Rate Perturbation: A Generic Plugin of Learning Rate Schedule towards Flatter Local Minima
   Hengyu Liu, Qiang Fu, Lun Du, Tiancheng Zhang, Gensitskiy Yu., Shi Han, Dongmei Zhang (25 Aug 2022)

9. Optimal learning rate schedules in high-dimensional non-convex optimization problems
   Stéphane d'Ascoli, Maria Refinetti, Giulio Biroli (09 Feb 2022)

10. Noether's Learning Dynamics: Role of Symmetry Breaking in Neural Networks
    Hidenori Tanaka, D. Kunin (06 May 2021)

11. The large learning rate phase of deep learning: the catapult mechanism [ODL]
    Aitor Lewkowycz, Yasaman Bahri, Ethan Dyer, Jascha Narain Sohl-Dickstein, Guy Gur-Ari (04 Mar 2020)

12. Scaling Laws for Neural Language Models
    Jared Kaplan, Sam McCandlish, T. Henighan, Tom B. Brown, B. Chess, R. Child, Scott Gray, Alec Radford, Jeff Wu, Dario Amodei (23 Jan 2020)

13. L4: Practical loss-based stepsize adaptation for deep learning [ODL]
    Michal Rolínek, Georg Martius (14 Feb 2018)