Is SGD a Bayesian sampler? Well, almost
arXiv: 2006.15191
26 June 2020
Chris Mingard, Guillermo Valle Pérez, Joar Skalse, A. Louis
Tags: BDL
Papers citing "Is SGD a Bayesian sampler? Well, almost" (14 of 14 papers shown):
| Title | Authors | Tags | Date |
|---|---|---|---|
| Variational Stochastic Gradient Descent for Deep Neural Networks | Haotian Chen, Anna Kuzina, Babak Esmaeili, Jakub M. Tomczak | | 09 Apr 2024 |
| Bayesian Uncertainty Estimation by Hamiltonian Monte Carlo: Applications to Cardiac MRI Segmentation | Yidong Zhao, João Tourais, Iain Pierce, Christian Nitsche, T. Treibel, Sebastian Weingartner, Artur M. Schweidtmann, Qian Tao | BDL, UQCV | 04 Mar 2024 |
| Predictive Minds: LLMs As Atypical Active Inference Agents | Jan Kulveit, Clem von Stengel, Roman Leventov | LLMAG, KELM, LRM | 16 Nov 2023 |
| Points of non-linearity of functions generated by random neural networks | David Holmes | | 19 Apr 2023 |
| Do deep neural networks have an inbuilt Occam's razor? | Chris Mingard, Henry Rees, Guillermo Valle Pérez, A. Louis | UQCV, BDL | 13 Apr 2023 |
| Investigating Generalization by Controlling Normalized Margin | Alexander R. Farhang, Jeremy Bernstein, Kushal Tirumala, Yang Liu, Yisong Yue | | 08 May 2022 |
| Contrasting random and learned features in deep Bayesian linear regression | Jacob A. Zavatone-Veth, William L. Tong, C. Pehlevan | BDL, MLT | 01 Mar 2022 |
| Optimal learning rate schedules in high-dimensional non-convex optimization problems | Stéphane d'Ascoli, Maria Refinetti, Giulio Biroli | | 09 Feb 2022 |
| Separation of Scales and a Thermodynamic Description of Feature Learning in Some CNNs | Inbar Seroussi, Gadi Naveh, Z. Ringel | | 31 Dec 2021 |
| Computing the Information Content of Trained Neural Networks | Jeremy Bernstein, Yisong Yue | | 01 Mar 2021 |
| Predicting the outputs of finite deep neural networks trained with noisy gradients | Gadi Naveh, Oded Ben-David, H. Sompolinsky, Z. Ringel | | 02 Apr 2020 |
| Scaling Laws for Neural Language Models | Jared Kaplan, Sam McCandlish, T. Henighan, Tom B. Brown, B. Chess, R. Child, Scott Gray, Alec Radford, Jeff Wu, Dario Amodei | | 23 Jan 2020 |
| On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima | N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang | ODL | 15 Sep 2016 |
| The Loss Surfaces of Multilayer Networks | A. Choromańska, Mikael Henaff, Michaël Mathieu, Gerard Ben Arous, Yann LeCun | ODL | 30 Nov 2014 |