Stochastic Markov Gradient Descent and Training Low-Bit Neural Networks

25 August 2020

Abstract

The massive size of modern neural networks has motivated substantial recent interest in neural network quantization. We introduce Stochastic Markov Gradient Descent (SMGD), a discrete optimization method applicable to training quantized neural networks. The SMGD algorithm is designed for settings where memory is highly constrained during training. We provide theoretical guarantees of algorithm performance as well as encouraging numerical results.
