
Stochastic Gradient Descent as Approximate Bayesian Inference

13 April 2017

Matthew D. Hoffman

Papers citing "Stochastic Gradient Descent as Approximate Bayesian Inference"

50 / 327 papers shown

Title
Subspace Inference for Bayesian Deep Learning Pavel Izmailov Wesley J. Maddox Polina Kirichenko T. Garipov Dmitry Vetrov A. Wilson UQCV BDL 97 144 0 17 Jul 2019
Neural ODEs as the Deep Limit of ResNets with constant weights B. Avelin K. Nystrom ODL 141 32 0 28 Jun 2019
On the interplay between noise and curvature and its effect on optimization and generalization Valentin Thomas Fabian Pedregosa B. V. Merrienboer Pierre-Antoine Manzagol Yoshua Bengio Nicolas Le Roux 62 61 0 18 Jun 2019
On the Noisy Gradient Descent that Generalizes as SGD Jingfeng Wu Wenqing Hu Haoyi Xiong Jun Huan Vladimir Braverman Zhanxing Zhu MLT 73 10 0 18 Jun 2019
Adaptively Preconditioned Stochastic Gradient Langevin Dynamics C. A. Bhardwaj ODL 21 9 0 10 Jun 2019
Improving Neural Language Modeling via Adversarial Training Dilin Wang Chengyue Gong Qiang Liu AAML 115 119 0 10 Jun 2019
Practical Deep Learning with Bayesian Principles Kazuki Osawa S. Swaroop Anirudh Jain Runa Eschenhagen Richard Turner Rio Yokota Mohammad Emtiyaz Khan BDL UQCV 167 247 0 06 Jun 2019
Replica-exchange Nosé-Hoover dynamics for Bayesian learning on large datasets Rui Luo Qiang Zhang Yaodong Yang Jun Wang BDL 71 3 0 29 May 2019
Structural Equation Models as Computation Graphs E. V. Kesteren Daniel L. Oberski 15 1 0 11 May 2019
The Effect of Network Width on Stochastic Gradient Descent and Generalization: an Empirical Study Daniel S. Park Jascha Narain Sohl-Dickstein Quoc V. Le Samuel L. Smith 96 57 0 09 May 2019
A Generative Model for Sampling High-Performance and Diverse Weights for Neural Networks Lior Deutsch Erik Nijkamp Yu Yang 71 16 0 07 May 2019
Differentiable Visual Computing Tzu-Mao Li 54 15 0 27 Apr 2019
Some Limit Properties of Markov Chains Induced by Stochastic Recursive Algorithms Abhishek Gupta Hao Chen Jianzong Pi Gaurav Tendolkar 28 0 0 24 Apr 2019
Bayesian Neural Networks at Finite Temperature R. Baldock Nicola Marzari 40 3 0 08 Apr 2019
Data Augmentation for Bayesian Deep Learning YueXing Wang Nicholas G. Polson Vadim Sokolov UQCV BDL 85 5 0 22 Mar 2019
Theoretical guarantees for sampling and inference in generative models with latent diffusions Belinda Tzen Maxim Raginsky DiffM 76 102 0 05 Mar 2019
An Empirical Study of Large-Batch Stochastic Gradient Descent with Structured Covariance Noise Yeming Wen Kevin Luk Maxime Gazeau Guodong Zhang Harris Chan Jimmy Ba ODL 73 22 0 21 Feb 2019
Deep Multi-modal Object Detection and Semantic Segmentation for Autonomous Driving: Datasets, Methods, and Challenges Di Feng Christian Haase-Schuetz Lars Rosenbaum Heinz Hertlein Claudius Gläser Fabian Duffhauss W. Wiesbeck Klaus C. J. Dietmayer 3DPC 169 1,014 0 21 Feb 2019
Generalisation in fully-connected neural networks for time series forecasting Anastasia Borovykh C. Oosterlee S. Bohté OOD AI4TS 46 3 0 14 Feb 2019
A Simple Baseline for Bayesian Uncertainty in Deep Learning Wesley J. Maddox T. Garipov Pavel Izmailov Dmitry Vetrov A. Wilson BDL UQCV 140 810 0 07 Feb 2019
Stochastic Zeroth-order Discretizations of Langevin Diffusions for Bayesian Inference Abhishek Roy Lingqing Shen Krishnakumar Balasubramanian Saeed Ghadimi 94 6 0 04 Feb 2019
Quantitative Weak Convergence for Discrete Stochastic Processes Xiang Cheng Peter L. Bartlett Michael I. Jordan 36 5 0 03 Feb 2019
Uniform-in-Time Weak Error Analysis for Stochastic Gradient Descent Algorithms via Diffusion Approximation Yuanyuan Feng Tingran Gao Lei Li Jian‐Guo Liu Yulong Lu 63 25 0 02 Feb 2019
Quasi-potential as an implicit regularizer for the loss function in the stochastic gradient descent Wenqing Hu Zhanxing Zhu Haoyi Xiong Jun Huan MLT 54 10 0 18 Jan 2019
A continuous-time analysis of distributed stochastic gradient Nicholas M. Boffi Jean-Jacques E. Slotine 46 15 0 28 Dec 2018
An Empirical Model of Large-Batch Training Sam McCandlish Jared Kaplan Dario Amodei OpenAI Dota Team 76 280 0 14 Dec 2018
Towards Theoretical Understanding of Large Batch Training in Stochastic Gradient Descent Xiaowu Dai Yuhua Zhu 75 11 0 03 Dec 2018
Understanding the impact of entropy on policy optimization Zafarali Ahmed Nicolas Le Roux Mohammad Norouzi Dale Schuurmans 83 238 0 27 Nov 2018
Practical Bayesian Learning of Neural Networks via Adaptive Optimisation Methods Caroline Werther M. Ferguson K. Park Cuixian Chen Stephen J. Roberts ODL 21 4 0 08 Nov 2018
Stochastic Modified Equations and Dynamics of Stochastic Gradient Algorithms I: Mathematical Foundations Qianxiao Li Cheng Tai Weinan E 124 150 0 05 Nov 2018
A Gaussian Process perspective on Convolutional Neural Networks Anastasia Borovykh 81 19 0 25 Oct 2018
Rate Distortion For Model Compression: From Theory To Practice Weihao Gao Yu-Han Liu Chong-Jun Wang Sewoong Oh 86 31 0 09 Oct 2018
Continuous-time Models for Stochastic Optimization Algorithms Antonio Orvieto Aurelien Lucchi 119 32 0 05 Oct 2018
Fluctuation-dissipation relations for stochastic gradient descent Sho Yaida 113 75 0 28 Sep 2018
Compositional Stochastic Average Gradient for Machine Learning and Related Applications Tsung-Yu Hsieh Y. El-Manzalawy Yiwei Sun Vasant Honavar 44 1 0 04 Sep 2018
Don't Use Large Mini-Batches, Use Local SGD Tao R. Lin Sebastian U. Stich Kumar Kshitij Patel Martin Jaggi 123 432 0 22 Aug 2018
Deep Convolutional Networks as shallow Gaussian Processes Adrià Garriga-Alonso C. Rasmussen Laurence Aitchison BDL UQCV 116 271 0 16 Aug 2018
Bayesian filtering unifies adaptive and non-adaptive neural network optimization methods Laurence Aitchison ODL 109 21 0 19 Jul 2018
TherML: Thermodynamics of Machine Learning Alexander A. Alemi Ian S. Fischer DRL AI4CE 58 29 0 11 Jul 2018
Quasi-Monte Carlo Variational Inference Alexander K. Buchholz F. Wenzel Stephan Mandt BDL 105 60 0 04 Jul 2018
Stochastic natural gradient descent draws posterior samples in function space Samuel L. Smith Daniel Duckworth Semon Rezchikov Quoc V. Le Jascha Narain Sohl-Dickstein BDL 85 6 0 25 Jun 2018
On the Spectral Bias of Neural Networks Nasim Rahaman A. Baratin Devansh Arpit Felix Dräxler Min Lin Fred Hamprecht Yoshua Bengio Aaron Courville 172 1,462 0 22 Jun 2018
Laplacian Smoothing Gradient Descent Stanley Osher Bao Wang Penghang Yin Xiyang Luo Farzin Barekat Minh Pham A. Lin ODL 113 43 0 17 Jun 2018
There Are Many Consistent Explanations of Unlabeled Data: Why You Should Average Ben Athiwaratkun Marc Finzi Pavel Izmailov A. Wilson 281 244 0 14 Jun 2018
Fast and Scalable Bayesian Deep Learning by Weight-Perturbation in Adam Mohammad Emtiyaz Khan Didrik Nielsen Voot Tangkaratt Wu Lin Y. Gal Akash Srivastava ODL 194 271 0 13 Jun 2018
Scalable Natural Gradient Langevin Dynamics in Practice Henri Palacci H. Hess BDL 30 8 0 07 Jun 2018
Deep learning generalizes because the parameter-function map is biased towards simple functions Guillermo Valle Pérez Chico Q. Camargo A. Louis MLT AI4CE 122 232 0 22 May 2018
Classifier-agnostic saliency map extraction Konrad Zolna Krzysztof J. Geras Kyunghyun Cho 78 29 0 21 May 2018
SmoothOut: Smoothing Out Sharp Minima to Improve Generalization in Deep Learning W. Wen Yandan Wang Feng Yan Cong Xu Chunpeng Wu Yiran Chen H. Li 79 51 0 21 May 2018
Gaussian Process Behaviour in Wide Deep Neural Networks A. G. Matthews Mark Rowland Jiri Hron Richard Turner Zoubin Ghahramani BDL 192 561 0 30 Apr 2018