On the Periodic Behavior of Neural Network Training with Batch Normalization and Weight Decay

Neural Information Processing Systems (NeurIPS), 2021
29 June 2021
E. Lobacheva
M. Kodryan
Nadezhda Chirkova
A. Malinin
Dmitry Vetrov

Papers citing "On the Periodic Behavior of Neural Network Training with Batch Normalization and Weight Decay"

18 papers

Can Training Dynamics of Scale-Invariant Neural Networks Be Explained by the Thermodynamics of an Ideal Gas?
Ildus Sadrtdinov
E. Lobacheva
Ivan Klimov
Mikhail I. Katsnelson
Dmitry Vetrov
10 Nov 2025

SGD as Free Energy Minimization: A Thermodynamic View on Neural Network Training
Ildus Sadrtdinov
Ivan Klimov
E. Lobacheva
Dmitry Vetrov
29 May 2025

Where Do Large Learning Rates Lead Us?
Neural Information Processing Systems (NeurIPS), 2024
Ildus Sadrtdinov
M. Kodryan
Eduard Pokonechny
E. Lobacheva
Dmitry Vetrov
29 Oct 2024

Normalization and effective learning rates in reinforcement learning
Clare Lyle
Zeyu Zheng
Khimya Khetarpal
James Martens
H. V. Hasselt
Razvan Pascanu
Will Dabney
01 Jul 2024

Future Directions in the Theory of Graph Machine Learning
Christopher Morris
Fabrizio Frasca
Nadav Dym
Haggai Maron
İsmail İlkan Ceylan
Ron Levie
Derek Lim
Michael M. Bronstein
Martin Grohe
Stefanie Jegelka
03 Feb 2024

Large Learning Rates Improve Generalization: But How Large Are We Talking About?
E. Lobacheva
Eduard Pockonechnyy
M. Kodryan
Dmitry Vetrov
19 Nov 2023

From Stability to Chaos: Analyzing Gradient Descent Dynamics in Quadratic Regression
Xuxing Chen
Krishnakumar Balasubramanian
Promit Ghosal
Bhavya Agrawalla
02 Oct 2023

Exploring Weight Balancing on Long-Tailed Recognition Problem
International Conference on Learning Representations (ICLR), 2024
Naoya Hasegawa
Issei Sato
26 May 2023

Adversarial Attacks and Defenses in Machine Learning-Powered Networks: A Contemporary Survey
Yulong Wang
Tong Sun
Shenghong Li
Xinnan Yuan
W. Ni
Ekram Hossain
H. Vincent Poor
11 Mar 2023

On the Training Instability of Shuffling SGD with Batch Normalization
International Conference on Machine Learning (ICML), 2023
David Wu
Chulhee Yun
S. Sra
24 Feb 2023

Batchless Normalization: How to Normalize Activations with just one Instance in Memory
Benjamin Berger
30 Dec 2022

Scale-invariant Bayesian Neural Networks with Connectivity Tangent Kernel
International Conference on Learning Representations (ICLR), 2023
Sungyub Kim
Si-hun Park
Kyungsu Kim
Eunho Yang
30 Sep 2022

Training Scale-Invariant Neural Networks on the Sphere Can Happen in Three Regimes
Neural Information Processing Systems (NeurIPS), 2022
M. Kodryan
E. Lobacheva
M. Nakhodnov
Dmitry Vetrov
08 Sep 2022

On the generalization of learning algorithms that do not converge
Neural Information Processing Systems (NeurIPS), 2022
N. Chandramoorthy
Andreas Loukas
Khashayar Gatmiry
Stefanie Jegelka
16 Aug 2022

Adapting the Linearised Laplace Model Evidence for Modern Deep Learning
International Conference on Machine Learning (ICML), 2022
Javier Antorán
David Janz
J. Allingham
Erik A. Daxberger
Riccardo Barbano
Eric T. Nalisnick
José Miguel Hernández-Lobato
17 Jun 2022

Understanding the Generalization Benefit of Normalization Layers: Sharpness Reduction
Neural Information Processing Systems (NeurIPS), 2022
Kaifeng Lyu
Zhiyuan Li
Sanjeev Arora
14 Jun 2022

Robust Training of Neural Networks Using Scale Invariant Architectures
International Conference on Machine Learning (ICML), 2022
Zhiyuan Li
Srinadh Bhojanapalli
Manzil Zaheer
Sashank J. Reddi
Sanjiv Kumar
02 Feb 2022

Neural Network Weights Do Not Converge to Stationary Points: An Invariant Measure Perspective
International Conference on Machine Learning (ICML), 2022
Jingzhao Zhang
Haochuan Li
S. Sra
Ali Jadbabaie
12 Oct 2021