Stochastic Normalized Gradient Descent with Momentum for Large-Batch Training
arXiv:2007.13985 (ODL)
28 July 2020
Shen-Yi Zhao, Chang-Wei Shi, Yin-Peng Xie, Wu-Jun Li
Papers citing "Stochastic Normalized Gradient Descent with Momentum for Large-Batch Training" (8 papers shown)
Neural Optimizer Equation, Decay Function, and Learning Rate Schedule Joint Evolution
Brandon Morgan, Dean Frederick Hougen. ODL. 10 Apr 2024.
On the Optimal Batch Size for Byzantine-Robust Distributed Learning
Yi-Rui Yang, Chang-Wei Shi, Wu-Jun Li. FedML, AAML. 23 May 2023.
Revisiting Outer Optimization in Adversarial Training
Ali Dabouei, Fariborz Taherkhani, Sobhan Soleymani, Nasser M. Nasrabadi. AAML. 02 Sep 2022.
Automatic Clipping: Differentially Private Deep Learning Made Easier and Stronger
Zhiqi Bu, Yu-Xiang Wang, Sheng Zha, George Karypis. 14 Jun 2022.
DecentLaM: Decentralized Momentum SGD for Large-batch Deep Training
Kun Yuan, Yiming Chen, Xinmeng Huang, Yingya Zhang, Pan Pan, Yinghui Xu, W. Yin. MoE. 24 Apr 2021.
Descending through a Crowded Valley - Benchmarking Deep Learning Optimizers
Robin M. Schmidt, Frank Schneider, Philipp Hennig. ODL. 03 Jul 2020.
Global Momentum Compression for Sparse Communication in Distributed Learning
Chang-Wei Shi, Shen-Yi Zhao, Yin-Peng Xie, Hao Gao, Wu-Jun Li. 30 May 2019.
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang. ODL. 15 Sep 2016.