Dynamical Isometry and a Mean Field Theory of LSTMs and GRUs

25 January 2019

Papers citing "Dynamical Isometry and a Mean Field Theory of LSTMs and GRUs"

30 / 30 papers shown

Title
Time-Scale Coupling Between States and Parameters in Recurrent Neural Networks Lorenzo Livi 144 1 0 16 Aug 2025
Revisiting Glorot Initialization for Long-Range Linear Recurrences Noga Bar Mariia Seleznova Yotam Alexander Gitta Kutyniok Raja Giryes 131 0 0 26 May 2025
Deep Neural Network Initialization with Sparsity Inducing Activations Ilan Price Nicholas Daultry Ball Samuel C.H. Lam Adam C. Jones Jared Tanner AI4CE 150 2 0 25 Feb 2024
Gradient Flossing: Improving Gradient Descent through Dynamic Control of Jacobians Rainer Engelken 176 10 0 28 Dec 2023
On the Neural Tangent Kernel of Equilibrium Models Zhili Feng J. Zico Kolter 185 8 0 21 Oct 2023
On the Initialisation of Wide Low-Rank Feedforward Neural Networks Thiziri Nait Saada Jared Tanner 131 2 0 31 Jan 2023
Statistical Physics of Deep Neural Networks: Initialization toward Optimal Channels. Physical Review Research (Phys. Rev. Res.), 2022 Kangyu Weng Aohua Cheng Ziyang Zhang Pei Sun Yang Tian 259 4 0 04 Dec 2022
Analysis of Convolutions, Non-linearity and Depth in Graph Neural Networks using Neural Tangent Kernel Mahalakshmi Sabanayagam Pascal Esser Debarghya Ghoshdastidar 285 3 0 18 Oct 2022
Random orthogonal additive filters: a solution to the vanishing/exploding gradient of deep neural networks. IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2022 Andrea Ceni ODL 112 10 0 03 Oct 2022
Generalizing Goal-Conditioned Reinforcement Learning with Variational Causal Reasoning. Neural Information Processing Systems (NeurIPS), 2022 Wenhao Ding Haohong Lin Yue Liu Ding Zhao LRM 448 49 0 19 Jul 2022
Recency Dropout for Recurrent Recommender Systems Bo-Yu Chang Can Xu Matt Le Jingchen Feng Ya Le Sriraj Badam Ed H. Chi Minmin Chen 113 5 0 26 Jan 2022
The edge of chaos: quantum field theory and deep neural networks. SciPost Physics (SciPost Phys.), 2021 Kevin T. Grosvenor R. Jefferson 156 26 0 27 Sep 2021
Towards quantifying information flows: relative entropy in deep neural networks and the renormalization group. SciPost Physics (SciPost Phys.), 2021 J. Erdmenger Kevin T. Grosvenor R. Jefferson 114 21 0 14 Jul 2021
Asymptotic Freeness of Layerwise Jacobians Caused by Invariance of Multilayer Perceptron: The Haar Orthogonal Case. Communications in Mathematical Physics (Commun. Math. Phys.), 2021 B. Collins Tomohiro Hayase 177 8 0 24 Mar 2021
Feature Learning in Infinite-Width Neural Networks Greg Yang J. E. Hu MLT 352 179 0 30 Nov 2020
Beyond Signal Propagation: Is Feature Diversity Necessary in Deep Neural Network Initialization? Yaniv Blumenfeld D. Gilboa Daniel Soudry ODL 178 16 0 02 Jul 2020
On Lyapunov Exponents for RNNs: Understanding Information Propagation Using Dynamical Systems Tools. Frontiers in Applied Mathematics and Statistics (FAMS), 2020 Ryan H. Vogt M. P. Touzel Eli Shlizerman Guillaume Lajoie 189 49 0 25 Jun 2020
The Spectrum of Fisher Information of Deep Networks Achieving Dynamical Isometry. International Conference on Artificial Intelligence and Statistics (AISTATS), 2020 Tomohiro Hayase Ryo Karakida 252 9 0 14 Jun 2020
Dynamical mean-field theory for stochastic gradient descent in Gaussian mixture classification. Neural Information Processing Systems (NeurIPS), 2020 Francesca Mignacco Florent Krzakala Pierfrancesco Urbani Lenka Zdeborová MLT 280 73 0 10 Jun 2020
ReZero is All You Need: Fast Convergence at Large Depth. Conference on Uncertainty in Artificial Intelligence (UAI), 2020 Thomas C. Bachlechner Bodhisattwa Prasad Majumder H. H. Mao G. Cottrell Julian McAuley AI4CE 295 317 0 10 Mar 2020
Gating creates slow modes and controls phase-space complexity in GRUs and LSTMs. Mathematical and Scientific Machine Learning (MSML), 2020 T. Can K. Krishnamurthy D. Schwab AI4CE 345 20 0 31 Jan 2020
Provable Benefit of Orthogonal Initialization in Optimizing Deep Linear Networks. International Conference on Learning Representations (ICLR), 2020 Wei Hu Lechao Xiao Jeffrey Pennington 165 126 0 16 Jan 2020
Disentangling Trainability and Generalization in Deep Neural Networks Lechao Xiao Jeffrey Pennington S. Schoenholz 173 34 0 30 Dec 2019
Mean field theory for deep dropout networks: digging up gradient backpropagation deeply. European Conference on Artificial Intelligence (ECAI), 2019 Wei Huang R. Xu Weitao Du Yutian Zeng Yunce Zhao 128 6 0 19 Dec 2019
Optimization for deep learning: theory and algorithms Tian Ding ODL 251 177 0 19 Dec 2019
One-Shot Pruning of Recurrent Neural Networks by Jacobian Spectrum Evaluation. International Conference on Learning Representations (ICLR), 2019 Matthew Shunshi Zhang Bradly C. Stadie 104 34 0 30 Nov 2019
Mean-field inference methods for neural networks Marylou Gabrié AI4CE 307 34 0 03 Nov 2019
Deep Learning Theory Review: An Optimal Control and Dynamical Systems Perspective Guan-Horng Liu Evangelos A. Theodorou AI4CE 250 74 0 28 Aug 2019
A Mean Field Theory of Quantized Deep Networks: The Quantization-Depth Trade-Off. Neural Information Processing Systems (NeurIPS), 2019 Yaniv Blumenfeld D. Gilboa Daniel Soudry MQ 185 14 0 03 Jun 2019
A Mean Field Theory of Batch Normalization Greg Yang Jeffrey Pennington Vinay Rao Jascha Narain Sohl-Dickstein S. Schoenholz 186 184 0 21 Feb 2019