
A Second look at Exponential and Cosine Step Sizes: Simplicity, Adaptivity, and Performance

International Conference on Machine Learning (ICML), 2020
12 February 2020
Xiaoyun Li, Zhenxun Zhuang, Francesco Orabona

Papers citing "A Second look at Exponential and Cosine Step Sizes: Simplicity, Adaptivity, and Performance"

16 citing papers:
Gradient Methods with Online Scaling Part I. Theoretical Foundations
Wenzhi Gao, Ya-Chi Chu, Yinyu Ye, Madeleine Udell
29 May 2025

Sharpness-Aware Minimization with Adaptive Regularization for Training Deep Neural Networks
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Jinping Zou, Xiaoge Deng, Tao Sun
22 Dec 2024

A Generalized Version of Chung's Lemma and its Applications
Li Jiang, Xiao Li, Andre Milzarek, Junwen Qiu
09 Jun 2024

New logarithmic step size for stochastic gradient descent
M. S. Shamaee, S. F. Hafshejani, Z. Saeidian
01 Apr 2024

Shuffling Momentum Gradient Algorithm for Convex Optimization
Trang H. Tran, Quoc Tran-Dinh, Lam M. Nguyen
05 Mar 2024

(Accelerated) Noise-adaptive Stochastic Heavy-Ball Momentum
Anh Dang, Reza Babanezhad, Sharan Vaswani
12 Jan 2024

Based on What We Can Control Artificial Neural Networks
Cheng Kang, Xujing Yao
09 Oct 2023

Modified Step Size for Enhanced Stochastic Gradient Descent: Convergence and Experiments
M. S. Shamaee, S. F. Hafshejani
03 Sep 2023

Optimal Stochastic Non-smooth Non-convex Optimization through Online-to-Non-convex Conversion
International Conference on Machine Learning (ICML), 2023
Ashok Cutkosky, Harsh Mehta, Francesco Orabona
07 Feb 2023

Target-based Surrogates for Stochastic Optimization
International Conference on Machine Learning (ICML), 2023
J. Lavington, Sharan Vaswani, Reza Babanezhad, Mark Schmidt, Nicolas Le Roux
06 Feb 2023

Understanding AdamW through Proximal Methods and Scale-Freeness
Zhenxun Zhuang, Mingrui Liu, Ashok Cutkosky, Francesco Orabona
31 Jan 2022

On Uniform Boundedness Properties of SGD and its Momentum Variants
Xiaoyu Wang, M. Johansson
25 Jan 2022

An Optimization Framework for Federated Edge Learning
IEEE Transactions on Wireless Communications (IEEE TWC), 2021
Yangchen Li, Ying Cui, Vincent K. N. Lau
26 Nov 2021

Eigencurve: Optimal Learning Rate Schedule for SGD on Quadratic Objectives with Skewed Hessian Spectrums
Boyao Wang, Haishan Ye, Tong Zhang
27 Oct 2021

Bandwidth-based Step-Sizes for Non-Convex Stochastic Optimization
Xiaoyu Wang, M. Johansson
05 Jun 2021

On the Convergence of Step Decay Step-Size for Stochastic Optimization
Neural Information Processing Systems (NeurIPS), 2021
Xiaoyu Wang, Sindri Magnússon, M. Johansson
18 Feb 2021