
Towards the Three-Phase Dynamics of Generalization Power of a DNN

Abstract

This paper proposes a new perspective for analyzing the generalization power of deep neural networks (DNNs), i.e., directly disentangling and analyzing the dynamics of the generalizable and non-generalizable interactions encoded by a DNN throughout the training process. Specifically, this work builds upon recent theoretical achievements in explainable AI, which prove that the detailed inference logic of a DNN can be strictly rewritten as a small number of AND-OR interaction patterns. Based on this, we propose an efficient method to quantify the generalization power of each interaction, and we discover a distinct three-phase dynamics of the generalization power of interactions during training. In particular, the early phase of training typically removes noisy and non-generalizable interactions and learns simple, generalizable ones. The second and third phases tend to capture increasingly complex interactions that are harder to generalize. Experimental results verify that the learning of non-generalizable interactions is the direct cause of the gap between the training and testing losses.
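
To make the notion of an AND interaction pattern concrete, the sketch below computes one such interaction as a Harsanyi dividend over masked model outputs, in the spirit of the interaction framework this paper builds on; the callable v, the function and_interaction, and the toy model are illustrative assumptions, not the authors' implementation.

import itertools

def and_interaction(v, S):
    # Harsanyi AND interaction: I(S) = sum over T subset of S of (-1)^(|S|-|T|) * v(T),
    # where v(T) is the network output (e.g., log-odds of the true class) when only
    # the input variables in T are kept and all other variables are masked.
    total = 0.0
    for r in range(len(S) + 1):
        for T in itertools.combinations(S, r):
            total += (-1) ** (len(S) - len(T)) * v(T)
    return total

# Toy masked-output function: the output fires only when variables 0 and 1 are
# both present, so the only nonzero AND interaction should be I({0, 1}).
def v(T):
    return 1.0 if (0 in T and 1 in T) else 0.0

print(and_interaction(v, (0, 1)))  # 1.0
print(and_interaction(v, (0,)))    # 0.0

In practice, a DNN's output on a sample can be decomposed into a small number of such interaction terms, and the paper's contribution is to track, for each term, whether it generalizes from training to test samples as training proceeds.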

@article{he2025_2505.06993,
  title={Towards the Three-Phase Dynamics of Generalization Power of a DNN},
  author={Yuxuan He and Junpeng Zhang and Hongyuan Zhang and Quanshi Zhang},
  journal={arXiv preprint arXiv:2505.06993},
  year={2025}
}