Is SGD a Bayesian sampler? Well, almost
arXiv: 2006.15191
26 June 2020
Chris Mingard, Guillermo Valle Pérez, Joar Skalse, A. Louis
Tags: BDL
Papers citing "Is SGD a Bayesian sampler? Well, almost" (14 of 14 papers shown):
| Title | Authors | Tags | Date |
|---|---|---|---|
| Variational Stochastic Gradient Descent for Deep Neural Networks | Haotian Chen, Anna Kuzina, Babak Esmaeili, Jakub M. Tomczak | | 09 Apr 2024 |
| Bayesian Uncertainty Estimation by Hamiltonian Monte Carlo: Applications to Cardiac MRI Segmentation | Yidong Zhao, João Tourais, Iain Pierce, Christian Nitsche, T. Treibel, Sebastian Weingartner, Artur M. Schweidtmann, Qian Tao | BDL, UQCV | 04 Mar 2024 |
| Predictive Minds: LLMs As Atypical Active Inference Agents | Jan Kulveit, Clem von Stengel, Roman Leventov | LLMAG, KELM, LRM | 16 Nov 2023 |
| Points of non-linearity of functions generated by random neural networks | David Holmes | | 19 Apr 2023 |
| Do deep neural networks have an inbuilt Occam's razor? | Chris Mingard, Henry Rees, Guillermo Valle Pérez, A. Louis | UQCV, BDL | 13 Apr 2023 |
| Investigating Generalization by Controlling Normalized Margin | Alexander R. Farhang, Jeremy Bernstein, Kushal Tirumala, Yang Liu, Yisong Yue | | 08 May 2022 |
| Contrasting random and learned features in deep Bayesian linear regression | Jacob A. Zavatone-Veth, William L. Tong, C. Pehlevan | BDL, MLT | 01 Mar 2022 |
| Optimal learning rate schedules in high-dimensional non-convex optimization problems | Stéphane d'Ascoli, Maria Refinetti, Giulio Biroli | | 09 Feb 2022 |
| Separation of Scales and a Thermodynamic Description of Feature Learning in Some CNNs | Inbar Seroussi, Gadi Naveh, Z. Ringel | | 31 Dec 2021 |
| Computing the Information Content of Trained Neural Networks | Jeremy Bernstein, Yisong Yue | | 01 Mar 2021 |
| Predicting the outputs of finite deep neural networks trained with noisy gradients | Gadi Naveh, Oded Ben-David, H. Sompolinsky, Z. Ringel | | 02 Apr 2020 |
| Scaling Laws for Neural Language Models | Jared Kaplan, Sam McCandlish, T. Henighan, Tom B. Brown, B. Chess, R. Child, Scott Gray, Alec Radford, Jeff Wu, Dario Amodei | | 23 Jan 2020 |
| On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima | N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang | ODL | 15 Sep 2016 |
| The Loss Surfaces of Multilayer Networks | A. Choromańska, Mikael Henaff, Michaël Mathieu, Gerard Ben Arous, Yann LeCun | ODL | 30 Nov 2014 |