Learning Rates as a Function of Batch Size: A Random Matrix Theory Approach to Neural Network Training

16 June 2020
Diego Granziol
S. Zohren
Stephen J. Roberts
    ODL
arXiv: 2006.09092

Papers citing "Learning Rates as a Function of Batch Size: A Random Matrix Theory Approach to Neural Network Training"

10 / 10 papers shown
Time Transfer: On Optimal Learning Rate and Batch Size In The Infinite Data Limit
Oleg Filatov
Jan Ebert
Jiangtao Wang
Stefan Kesselheim
36
3
0
10 Jan 2025
A Cost-Aware Approach to Adversarial Robustness in Neural Networks
Charles Meyers
Mohammad Reza Saleh Sedghpour
Tommy Löfstedt
Erik Elmroth
OOD
AAML
31
0
0
11 Sep 2024
OpenChat: Advancing Open-source Language Models with Mixed-Quality Data
Guan-Bo Wang
Sijie Cheng
Xianyuan Zhan
Xiangang Li
Sen Song
Yang Liu
ALM
13
227
0
20 Sep 2023
Universal characteristics of deep neural network loss surfaces from random matrix theory
Nicholas P. Baskerville
J. Keating
F. Mezzadri
J. Najnudel
Diego Granziol
22
4
0
17 May 2022
The large learning rate phase of deep learning: the catapult mechanism
Aitor Lewkowycz
Yasaman Bahri
Ethan Dyer
Jascha Narain Sohl-Dickstein
Guy Gur-Ari
ODL
153
233
0
04 Mar 2020
Cleaning large correlation matrices: tools from random matrix theory
J. Bun
J. Bouchaud
M. Potters
27
262
0
25 Oct 2016
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar
Dheevatsa Mudigere
J. Nocedal
M. Smelyanskiy
P. T. P. Tang
ODL
273
2,886
0
15 Sep 2016
The Loss Surfaces of Multilayer Networks
A. Choromańska
Mikael Henaff
Michaël Mathieu
Gerard Ben Arous
Yann LeCun
ODL
175
1,184
0
30 Nov 2014
A simpler approach to obtaining an O(1/t) convergence rate for the projected stochastic subgradient method
Simon Lacoste-Julien
Mark W. Schmidt
Francis R. Bach
116
259
0
10 Dec 2012
Stochastic Gradient Descent for Non-smooth Optimization: Convergence Results and Optimal Averaging Schemes
Ohad Shamir
Tong Zhang
99
571
0
08 Dec 2012