
arXiv:2105.12221
Geometry of the Loss Landscape in Overparameterized Neural Networks: Symmetries and Invariances

25 May 2021
Berfin Şimşek
François Ged
Arthur Jacot
Francesco Spadaro
Clément Hongler
W. Gerstner
Johanni Brea
Abstract

We study how permutation symmetries in overparameterized multi-layer neural networks generate `symmetry-induced' critical points. Assuming a network with $L$ layers of minimal widths $r_1^*, \ldots, r_{L-1}^*$ reaches a zero-loss minimum at $r_1^*! \cdots r_{L-1}^*!$ isolated points that are permutations of one another, we show that adding one extra neuron to each layer is sufficient to connect all these previously discrete minima into a single manifold. For a two-layer overparameterized network of width $r^* + h =: m$ we explicitly describe the manifold of global minima: it consists of $T(r^*, m)$ affine subspaces of dimension at least $h$ that are connected to one another. For a network of width $m$, we identify the number $G(r, m)$ of affine subspaces containing only symmetry-induced critical points that are related to the critical points of a smaller network of width $r < r^*$. Via a combinatorial analysis, we derive closed-form formulas for $T$ and $G$ and show that the number of symmetry-induced critical subspaces dominates the number of affine subspaces forming the global minima manifold in the mildly overparameterized regime (small $h$) and vice versa in the vastly overparameterized regime ($h \gg r^*$). Our results provide new insights into the minimization of the non-convex loss function of overparameterized neural networks.
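The permutation symmetry at the heart of the abstract can be seen directly in a small example. The sketch below (not from the paper; the network shape, activation, and variable names are illustrative assumptions) builds a two-layer network and checks that permuting its hidden neurons, together with the matching rows and columns of the weight matrices, leaves the network function unchanged. Each of the $m!$ such permutations maps a minimum to an equally good minimum, which is the counting mechanism behind the $r_1^*! \cdots r_{L-1}^*!$ isolated minima.

```python
import numpy as np

rng = np.random.default_rng(0)

# A two-layer network f(x) = W2 @ tanh(W1 @ x) with m hidden neurons.
# Sizes are arbitrary choices for illustration.
d_in, m, d_out = 3, 4, 2
W1 = rng.normal(size=(m, d_in))   # hidden neuron i owns row i of W1
W2 = rng.normal(size=(d_out, m))  # ... and column i of W2

def f(x, W1, W2):
    return W2 @ np.tanh(W1 @ x)

# Apply the same permutation to the rows of W1 and the columns of W2:
# this relabels the hidden neurons without changing the function.
perm = rng.permutation(m)
W1_p = W1[perm, :]
W2_p = W2[:, perm]

x = rng.normal(size=d_in)
assert np.allclose(f(x, W1, W2), f(x, W1_p, W2_p))
```

Because the assertion holds for every input $x$ and every one of the $m!$ permutations, a zero-loss parameter setting is never unique; the paper's combinatorial quantities $T$ and $G$ count how these permuted copies organize into connected affine subspaces once extra neurons are added.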
