ResearchTrend.AI
The Butterfly Effect: Neural Network Training Trajectories Are Highly Sensitive to Initial Conditions

16 June 2025
Devin Kwok
Gül Sena Altıntaş
Colin Raffel
David Rolnick
arXiv (abs) · PDF · HTML
Main: 9 pages · Appendix: 15 pages · Bibliography: 5 pages · 30 figures · 6 tables
Abstract

Neural network training is inherently sensitive to initialization and to the randomness induced by stochastic gradient descent. However, it is unclear to what extent such effects lead to meaningfully different networks, either in terms of the models' weights or the underlying functions learned. In this work, we show that during the initial "chaotic" phase of training, even extremely small perturbations reliably cause otherwise identical training trajectories to diverge, an effect that diminishes rapidly over training time. We quantify this divergence through (i) the L2 distance between parameters, (ii) the loss barrier when interpolating between networks, (iii) the L2 distance and loss barrier between parameters after permutation alignment, and (iv) representational similarity between intermediate activations, revealing how perturbations under different hyperparameter or fine-tuning settings drive training trajectories toward distinct loss minima. Our findings provide insight into the stability of neural network training, with practical implications for fine-tuning, model merging, and the diversity of model ensembles.
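The first two divergence measures from the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' code: `l2_distance`, `loss_barrier`, and the toy double-well loss below are illustrative names and assumptions; in practice the parameters would be flattened network weights and `loss_fn` a held-out loss evaluated on data.

```python
import numpy as np

def l2_distance(theta_a, theta_b):
    """Euclidean distance between two flattened parameter vectors."""
    return float(np.linalg.norm(theta_a - theta_b))

def loss_barrier(loss_fn, theta_a, theta_b, n_points=11):
    """Loss barrier along the linear path between two networks:
    the maximum excess of the interpolated loss over the straight
    line connecting the two endpoint losses."""
    alphas = np.linspace(0.0, 1.0, n_points)
    interp = np.array([loss_fn((1 - a) * theta_a + a * theta_b) for a in alphas])
    endpoint_line = (1 - alphas) * loss_fn(theta_a) + alphas * loss_fn(theta_b)
    return float(np.max(interp - endpoint_line))

# Toy double-well loss with minima at -1 and +1, so the midpoint sits on a barrier.
loss = lambda theta: float(np.sum((theta**2 - 1.0) ** 2))
theta_a = np.array([-1.0])
theta_b = np.array([1.0])
print(l2_distance(theta_a, theta_b))        # 2.0
print(loss_barrier(loss, theta_a, theta_b)) # 1.0 (at the midpoint theta = 0)
```

Two networks in the same loss basin give a barrier near zero, while trajectories that diverged to distinct minima show a positive barrier, which is what distinguishes the "chaotic" early phase from later, stable training.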

View on arXiv
@article{kwok2025_2506.13234,
  title={The Butterfly Effect: Neural Network Training Trajectories Are Highly Sensitive to Initial Conditions},
  author={Devin Kwok and Gül Sena Altıntaş and Colin Raffel and David Rolnick},
  journal={arXiv preprint arXiv:2506.13234},
  year={2025}
}