Forget the Learning Rate, Decay Loss

Abstract
In the usual deep neural network optimization process, the learning rate is the most important hyper parameter, which greatly affects the final convergence effect. The purpose of learning rate is to control the stepsize and gradually reduce the impact of noise on the network. In this paper, we will use a fixed learning rate with method of decaying loss to control the magnitude of the update. We used Image classification, Semantic segmentation, and GANs to verify this method. Experiments show that the loss decay strategy can greatly improve the performance of the model
View on arXivComments on this paper