How Neural Networks Learn the Support is an Implicit Regularization Effect of SGD

17 June 2024

Papers citing "How Neural Networks Learn the Support is an Implicit Regularization Effect of SGD"


The Fair Language Model Paradox; Andrea Pinto, Tomer Galanti, Randall Balestriero; 13 0 0; 15 Oct 2024
SGD learning on neural networks: leap complexity and saddle-to-saddle dynamics; Emmanuel Abbe, Enric Boix-Adserà, Theodor Misiakiewicz; FedML, MLT; 76 72 0; 21 Feb 2023
Learning Single-Index Models with Shallow Neural Networks; A. Bietti, Joan Bruna, Clayton Sanford, M. Song; 150 65 0; 27 Oct 2022
What Happens after SGD Reaches Zero Loss? --A Mathematical Framework; Zhiyuan Li, Tianhao Wang, Sanjeev Arora; MLT; 83 98 0; 13 Oct 2021
Catastrophic Fisher Explosion: Early Phase Fisher Matrix Impacts Generalization; Stanislaw Jastrzebski, Devansh Arpit, Oliver Åstrand, Giancarlo Kerg, Huan Wang, Caiming Xiong, R. Socher, Kyunghyun Cho, Krzysztof J. Geras; AI4CE; 177 64 0; 28 Dec 2020
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima; N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang; ODL; 273 2,696 0; 15 Sep 2016