Multiple Descents in Deep Learning as a Sequence of Order-Chaos Transitions

Main: 9 pages, 7 figures, 1 table
Abstract

We observe a novel 'multiple-descent' phenomenon during LSTM training, in which the test loss cycles through long upward and downward trends multiple times after the model is overtrained. By carrying out asymptotic stability analysis of the models, we find that these cycles in the test loss are closely associated with the phase transition between order and chaos, and that the locally optimal epochs consistently lie at the critical transition point between the two phases. More importantly, the globally optimal epoch occurs at the first transition from order to chaos, where the 'width' of the 'edge of chaos' is widest, allowing the best exploration of better weight configurations for learning.
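The order/chaos distinction the abstract relies on can be probed numerically by estimating the largest Lyapunov exponent of the recurrent state map: a negative exponent indicates the ordered (stable) phase, a positive one indicates chaos. The sketch below is illustrative only — it uses a toy tanh recurrence in place of the paper's trained LSTM, and the function names (`lyapunov_estimate`, `step`) and gain values are assumptions, not the authors' code.

```python
import numpy as np

def lyapunov_estimate(step, h0, n_steps=200, eps=1e-8, seed=0):
    """Estimate the largest Lyapunov exponent of a hidden-state map.

    `step(h)` maps a hidden state to the next hidden state. We track a
    trajectory and an eps-perturbed copy, renormalizing the perturbation
    each step and averaging the log growth rate of their separation.
    Positive result -> chaotic phase; negative -> ordered (stable) phase.
    """
    rng = np.random.default_rng(seed)
    h = h0.copy()
    d = rng.normal(size=h.shape)
    d *= eps / np.linalg.norm(d)          # initial perturbation of size eps
    hp = h + d
    log_growth = 0.0
    for _ in range(n_steps):
        h, hp = step(h), step(hp)
        dist = np.linalg.norm(hp - h)
        log_growth += np.log(dist / eps)  # local expansion/contraction rate
        hp = h + (hp - h) * (eps / dist)  # renormalize separation to eps
    return log_growth / n_steps

# Toy recurrences standing in for a trained LSTM's state update.
# For a random tanh network with weights scaled by g/sqrt(N), the
# dynamics are ordered for g < 1 and chaotic for g > 1.
rng = np.random.default_rng(1)
N = 16
W = rng.normal(size=(N, N)) / np.sqrt(N)
ordered = lambda h: np.tanh(0.5 * W @ h)   # small gain: stable phase
chaotic = lambda h: np.tanh(3.0 * W @ h)   # large gain: chaotic phase
h0 = rng.normal(size=N)

print("ordered  lambda =", lyapunov_estimate(ordered, h0))
print("chaotic  lambda =", lyapunov_estimate(chaotic, h0))
```

A diagnostic like this, applied per training epoch, is the kind of measurement that can locate the order-chaos transitions the abstract associates with each descent in the test loss.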
