On the generalization of learning algorithms that do not converge. Neural Information Processing Systems (NeurIPS), 2022
Understanding the Generalization Benefit of Normalization Layers: Sharpness Reduction. Neural Information Processing Systems (NeurIPS), 2022
Chaotic Regularization and Heavy-Tailed Limits for Deterministic Gradient Descent. Neural Information Processing Systems (NeurIPS), 2022
Beyond the Quadratic Approximation: the Multiscale Structure of Neural Network Loss Landscapes. Journal of Machine Learning (JML), 2022
A novel multi-scale loss function for classification problems in machine learning. Journal of Computational Physics (JCP), 2021