Scheduled Restart Momentum for Accelerated Stochastic Gradient Descent. SIAM Journal on Imaging Sciences (SIIMS), 2020.
Understanding the Role of Momentum in Stochastic Gradient Methods. Neural Information Processing Systems (NeurIPS), 2019.
Demon: Improved Neural Network Training with Momentum Decay. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019.
Adaptive Weight Decay for Deep Neural Networks. IEEE Access, 2019.
Which Algorithmic Choices Matter at Which Batch Sizes? Insights From a Noisy Quadratic Model. Neural Information Processing Systems (NeurIPS), 2019.
The Role of Memory in Stochastic Optimization. Conference on Uncertainty in Artificial Intelligence (UAI), 2019.
The Step Decay Schedule: A Near Optimal, Geometrically Decaying Learning Rate Procedure for Least Squares. Neural Information Processing Systems (NeurIPS), 2019.
Measuring the Effects of Data Parallelism on Neural Network Training. Journal of Machine Learning Research (JMLR), 2018.