Automatic Gradient Descent: Deep Learning without Hyperparameters

11 April 2023

Papers citing "Automatic Gradient Descent: Deep Learning without Hyperparameters"


Title
Time Transfer: On Optimal Learning Rate and Batch Size In The Infinite Data Limit | Oleg Filatov, Jan Ebert, Jiangtao Wang, Stefan Kesselheim | - | 36 3 0 | 10 Jan 2025
Infinite Limits of Multi-head Transformer Dynamics | Blake Bordelon, Hamza Tahir Chaudhry, C. Pehlevan | AI4CE | 42 9 0 | 24 May 2024
High-Performance Large-Scale Image Recognition Without Normalization | Andrew Brock, Soham De, Samuel L. Smith, Karen Simonyan | VLM | 223 512 0 | 11 Feb 2021
On the distance between two neural networks and the stability of learning | Jeremy Bernstein, Arash Vahdat, Yisong Yue, Ming-Yu Liu | ODL | 190 57 0 | 09 Feb 2020