Universal Statistics of Fisher Information in Deep Neural Networks: Mean Field Approach

4 June 2018

Papers citing "Universal Statistics of Fisher Information in Deep Neural Networks: Mean Field Approach"

46 / 96 papers shown

Title | Authors | Tags | Counts (unlabeled in source) | Date
Towards NNGP-guided Neural Architecture Search | Daniel S. Park, Jaehoon Lee, Daiyi Peng, Yuan Cao, Jascha Narain Sohl-Dickstein | BDL | 18 32 0 | 11 Nov 2020
The power of quantum neural networks | Amira Abbas, David Sutter, Christa Zoufal, Aurelien Lucchi, Alessio Figalli, Stefan Woerner | - | 16 725 0 | 30 Oct 2020
Dissecting Hessian: Understanding Common Structure of Hessian in Neural Networks | Yikai Wu, Xingyu Zhu, Chenwei Wu, Annie Wang, Rong Ge | - | 16 42 0 | 08 Oct 2020
Understanding Approximate Fisher Information for Fast Convergence of Natural Gradient Descent in Wide Neural Networks | Ryo Karakida, Kazuki Osawa | - | 14 25 0 | 02 Oct 2020
Implicit Gradient Regularization | David Barrett, Benoit Dherin | - | 14 146 0 | 23 Sep 2020
Traces of Class/Cross-Class Structure Pervade Deep Learning Spectra | V. Papyan | - | 14 76 0 | 27 Aug 2020
Implicit Regularization via Neural Feature Alignment | A. Baratin, Thomas George, César Laurent, R. Devon Hjelm, Guillaume Lajoie, Pascal Vincent, Simon Lacoste-Julien | - | 18 6 0 | 03 Aug 2020
Finite Versus Infinite Neural Networks: an Empirical Study | Jaehoon Lee, S. Schoenholz, Jeffrey Pennington, Ben Adlam, Lechao Xiao, Roman Novak, Jascha Narain Sohl-Dickstein | - | 17 207 0 | 31 Jul 2020
Bayesian Deep Ensembles via the Neural Tangent Kernel | Bobby He, Balaji Lakshminarayanan, Yee Whye Teh | BDL UQCV | 9 116 0 | 11 Jul 2020
When Does Preconditioning Help or Hurt Generalization? | S. Amari, Jimmy Ba, Roger C. Grosse, Xuechen Li, Atsushi Nitanda, Taiji Suzuki, Denny Wu, Ji Xu | - | 34 32 0 | 18 Jun 2020
The Spectrum of Fisher Information of Deep Networks Achieving Dynamical Isometry | Tomohiro Hayase, Ryo Karakida | - | 27 7 0 | 14 Jun 2020
Non-convergence of stochastic gradient descent in the training of deep neural networks | Patrick Cheridito, Arnulf Jentzen, Florian Rossmannek | - | 14 37 0 | 12 Jun 2020
Spectra of the Conjugate Kernel and Neural Tangent Kernel for linear-width neural networks | Z. Fan, Zhichao Wang | - | 44 71 0 | 25 May 2020
Using a thousand optimization tasks to learn hyperparameter search strategies | Luke Metz, Niru Maheswaranathan, Ruoxi Sun, C. Freeman, Ben Poole, Jascha Narain Sohl-Dickstein | - | 20 45 0 | 27 Feb 2020
Layer-wise Conditioning Analysis in Exploring the Learning Dynamics of DNNs | Lei Huang, Jie Qin, Li Liu, Fan Zhu, Ling Shao | AI4CE | 28 11 0 | 25 Feb 2020
Towards Query-Efficient Black-Box Adversary with Zeroth-Order Natural Gradient Descent | Pu Zhao, Pin-Yu Chen, Siyue Wang, X. Lin | AAML | 8 36 0 | 18 Feb 2020
On the infinite width limit of neural networks with a standard parameterization | Jascha Narain Sohl-Dickstein, Roman Novak, S. Schoenholz, Jaehoon Lee | - | 24 47 0 | 21 Jan 2020
Any Target Function Exists in a Neighborhood of Any Sufficiently Wide Random Network: A Geometrical Perspective | S. Amari | - | 19 12 0 | 20 Jan 2020
Disentangling Trainability and Generalization in Deep Neural Networks | Lechao Xiao, Jeffrey Pennington, S. Schoenholz | - | 6 34 0 | 30 Dec 2019
Neural Tangents: Fast and Easy Infinite Neural Networks in Python | Roman Novak, Lechao Xiao, Jiri Hron, Jaehoon Lee, Alexander A. Alemi, Jascha Narain Sohl-Dickstein, S. Schoenholz | - | 27 224 0 | 05 Dec 2019
Information-Theoretic Local Minima Characterization and Regularization | Zhiwei Jia, Hao Su | - | 21 19 0 | 19 Nov 2019
Sparsification as a Remedy for Staleness in Distributed Asynchronous SGD | Rosa Candela, Giulio Franzese, Maurizio Filippone, Pietro Michiardi | - | 15 1 0 | 21 Oct 2019
Neural Spectrum Alignment: Empirical Study | Dmitry Kopitkov, Vadim Indelman | - | 27 14 0 | 19 Oct 2019
Pathological spectra of the Fisher information metric and its variants in deep neural networks | Ryo Karakida, S. Akaho, S. Amari | - | 17 27 0 | 14 Oct 2019
The asymptotic spectrum of the Hessian of DNN throughout training | Arthur Jacot, Franck Gabriel, Clément Hongler | - | 11 34 0 | 01 Oct 2019
Deep Learning Theory Review: An Optimal Control and Dynamical Systems Perspective | Guan-Horng Liu, Evangelos A. Theodorou | AI4CE | 11 71 0 | 28 Aug 2019
A Fine-Grained Spectral Perspective on Neural Networks | Greg Yang, Hadi Salman | - | 22 110 0 | 24 Jul 2019
Deep network as memory space: complexity, generalization, disentangled representation and interpretability | X. Dong, L. Zhou | - | 23 1 0 | 12 Jul 2019
Order and Chaos: NTK views on DNN Normalization, Checkerboard and Boundary Artifacts | Arthur Jacot, Franck Gabriel, François Ged, Clément Hongler | - | 11 23 0 | 11 Jul 2019
The Normalization Method for Alleviating Pathological Sharpness in Wide Neural Networks | Ryo Karakida, S. Akaho, S. Amari | - | 21 39 0 | 07 Jun 2019
Exact Convergence Rates of the Neural Tangent Kernel in the Large Depth Limit | Soufiane Hayou, Arnaud Doucet, Judith Rousseau | - | 16 4 0 | 31 May 2019
Limitations of the Empirical Fisher Approximation for Natural Gradient Descent | Frederik Kunstner, Lukas Balles, Philipp Hennig | - | 21 207 0 | 29 May 2019
A Geometric Modeling of Occam's Razor in Deep Learning | Ke Sun, Frank Nielsen | - | 11 4 0 | 27 May 2019
The Effect of Network Width on Stochastic Gradient Descent and Generalization: an Empirical Study | Daniel S. Park, Jascha Narain Sohl-Dickstein, Quoc V. Le, Samuel L. Smith | - | 14 57 0 | 09 May 2019
Mean-field Analysis of Batch Normalization | Ming-Bo Wei, J. Stokes, D. Schwab | MLT | 25 8 0 | 06 Mar 2019
Scaling Limits of Wide Neural Networks with Weight Sharing: Gaussian Process Behavior, Gradient Independence, and Neural Tangent Kernel Derivation | Greg Yang | - | 11 282 0 | 13 Feb 2019
Large-Scale Distributed Second-Order Optimization Using Kronecker-Factored Approximate Curvature for Deep Convolutional Neural Networks | Kazuki Osawa, Yohei Tsuji, Yuichiro Ueno, Akira Naruse, Rio Yokota, Satoshi Matsuoka | ODL | 28 95 0 | 29 Nov 2018
Measuring the Effects of Data Parallelism on Neural Network Training | Christopher J. Shallue, Jaehoon Lee, J. Antognini, J. Mamou, J. Ketterling, Yao Wang | - | 35 407 0 | 08 Nov 2018
Information Geometry of Orthogonal Initializations and Training | Piotr A. Sokól, Il-Su Park | AI4CE | 72 16 0 | 09 Oct 2018
Fisher Information and Natural Gradient Learning of Random Deep Networks | S. Amari, Ryo Karakida, Masafumi Oizumi | - | 14 34 0 | 22 Aug 2018
Neural Tangent Kernel: Convergence and Generalization in Neural Networks | Arthur Jacot, Franck Gabriel, Clément Hongler | - | 14 3,098 0 | 20 Jun 2018
Dynamical Isometry and a Mean Field Theory of RNNs: Gating Enables Signal Propagation in Recurrent Neural Networks | Minmin Chen, Jeffrey Pennington, S. Schoenholz | SyDa AI4CE | 6 114 0 | 14 Jun 2018
Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks | Lechao Xiao, Yasaman Bahri, Jascha Narain Sohl-Dickstein, S. Schoenholz, Jeffrey Pennington | - | 227 348 0 | 14 Jun 2018
Fisher-Rao Metric, Geometry, and Complexity of Neural Networks | Tengyuan Liang, T. Poggio, Alexander Rakhlin, J. Stokes | - | 25 224 0 | 05 Nov 2017
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima | N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang | ODL | 284 2,889 0 | 15 Sep 2016
Norm-Based Capacity Control in Neural Networks | Behnam Neyshabur, Ryota Tomioka, Nathan Srebro | - | 119 577 0 | 27 Feb 2015