1905.11286
Cited By
Stochastic Gradient Methods with Layer-wise Adaptive Moments for Training of Deep Networks
27 May 2019
Boris Ginsburg
P. Castonguay
Oleksii Hrinchuk
Oleksii Kuchaiev
Vitaly Lavrukhin
Ryan Leary
Jason Chun Lok Li
Huyen Nguyen
Yang Zhang
Jonathan M. Cohen
ODL
Papers citing
"Stochastic Gradient Methods with Layer-wise Adaptive Moments for Training of Deep Networks"
3 / 3 papers shown
Title
Low-Pass Filtering SGD for Recovering Flat Optima in the Deep Learning Optimization Landscape
Devansh Bisla
Jing Wang
A. Choromańska
25
34
0
20 Jan 2022
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
245
1,821
0
17 Sep 2019
Bag of Tricks for Image Classification with Convolutional Neural Networks
Tong He
Zhi-Li Zhang
Hang Zhang
Zhongyue Zhang
Junyuan Xie
Mu Li
221
1,399
0
04 Dec 2018