Pipe-SGD: A Decentralized Pipelined SGD Framework for Distributed Deep Net Training. Neural Information Processing Systems (NeurIPS), 2018.
Parallel Restarted SGD with Faster Convergence and Less Communication: Demystifying Why Model Averaging Works for Deep Learning. AAAI Conference on Artificial Intelligence (AAAI), 2018.
A Simple Stochastic Variance Reduced Algorithm with Fast Convergence Rates. International Conference on Machine Learning (ICML), 2018.
VR-SGD: A Simple Stochastic Variance Reduction Method for Machine Learning. IEEE Transactions on Knowledge and Data Engineering (TKDE), 2018.
Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis. ACM Computing Surveys (CSUR), 2018.
On Nonconvex Decentralized Gradient Descent. IEEE Transactions on Signal Processing (IEEE TSP), 2016.