Dynamical Isometry and a Mean Field Theory of LSTMs and GRUs

25 January 2019

Papers citing "Dynamical Isometry and a Mean Field Theory of LSTMs and GRUs"

30 / 30 papers shown

Time-Scale Coupling Between States and Parameters in Recurrent Neural Networks

Lorenzo Livi

211

16 Aug 2025

Revisiting Glorot Initialization for Long-Range Linear Recurrences

172

26 May 2025

Deep Neural Network Initialization with Sparsity Inducing Activations

Ilan Price

Nicholas Daultry Ball

189

25 Feb 2024

Gradient Flossing: Improving Gradient Descent through Dynamic Control of Jacobians

Rainer Engelken

233

28 Dec 2023

On the Neural Tangent Kernel of Equilibrium Models

Zhili Feng

J. Zico Kolter

221

21 Oct 2023

On the Initialisation of Wide Low-Rank Feedforward Neural Networks

Thiziri Nait Saada

Jared Tanner

179

31 Jan 2023

Statistical Physics of Deep Neural Networks: Initialization toward Optimal Channels

Physical Review Research (Phys. Rev. Res.), 2022

283

04 Dec 2022

Analysis of Convolutions, Non-linearity and Depth in Graph Neural Networks using Neural Tangent Kernel

Mahalakshmi Sabanayagam

Pascal Esser

Debarghya Ghoshdastidar

382

18 Oct 2022

Random orthogonal additive filters: a solution to the vanishing/exploding gradient of deep neural networks

IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2022

Andrea Ceni

ODL

156

03 Oct 2022

Generalizing Goal-Conditioned Reinforcement Learning with Variational Causal Reasoning

Neural Information Processing Systems (NeurIPS), 2022

Ding Zhao

545

19 Jul 2022

Recency Dropout for Recurrent Recommender Systems

122

26 Jan 2022

The edge of chaos: quantum field theory and deep neural networks

SciPost Physics (SciPost Phys.), 2021

Kevin T. Grosvenor

R. Jefferson

212

27 Sep 2021

Towards quantifying information flows: relative entropy in deep neural networks and the renormalization group

SciPost Physics (SciPost Phys.), 2021

J. Erdmenger

Kevin T. Grosvenor

R. Jefferson

169

14 Jul 2021

Asymptotic Freeness of Layerwise Jacobians Caused by Invariance of Multilayer Perceptron: The Haar Orthogonal Case

Communications in Mathematical Physics (Commun. Math. Phys.), 2021

B. Collins

Tomohiro Hayase

243

24 Mar 2021

Feature Learning in Infinite-Width Neural Networks

Greg Yang

J. E. Hu

MLT

422

181

30 Nov 2020

Beyond Signal Propagation: Is Feature Diversity Necessary in Deep Neural Network Initialization?

199

02 Jul 2020

On Lyapunov Exponents for RNNs: Understanding Information Propagation Using Dynamical Systems Tools

Frontiers in Applied Mathematics and Statistics (FAMS), 2020

213

25 Jun 2020

The Spectrum of Fisher Information of Deep Networks Achieving Dynamical Isometry

International Conference on Artificial Intelligence and Statistics (AISTATS), 2020

Tomohiro Hayase

Ryo Karakida

298

14 Jun 2020

Dynamical mean-field theory for stochastic gradient descent in Gaussian mixture classification

Neural Information Processing Systems (NeurIPS), 2020

Lenka Zdeborová

346

10 Jun 2020

ReZero is All You Need: Fast Convergence at Large Depth

Conference on Uncertainty in Artificial Intelligence (UAI), 2020

Thomas C. Bachlechner

Bodhisattwa Prasad Majumder

379

329

10 Mar 2020

Gating creates slow modes and controls phase-space complexity in GRUs and LSTMs

Mathematical and Scientific Machine Learning (MSML), 2020

377

31 Jan 2020

Provable Benefit of Orthogonal Initialization in Optimizing Deep Linear Networks

International Conference on Learning Representations (ICLR), 2020

Wei Hu

Lechao Xiao

Jeffrey Pennington

210

129

16 Jan 2020

Disentangling Trainability and Generalization in Deep Neural Networks

Lechao Xiao

Jeffrey Pennington

S. Schoenholz

199

30 Dec 2019

Mean field theory for deep dropout networks: digging up gradient backpropagation deeply

European Conference on Artificial Intelligence (ECAI), 2019

Wei Huang

157

19 Dec 2019

Optimization for deep learning: theory and algorithms

Tian Ding

ODL

340

178

19 Dec 2019

One-Shot Pruning of Recurrent Neural Networks by Jacobian Spectrum Evaluation

International Conference on Learning Representations (ICLR), 2019

Matthew Shunshi Zhang

Bradly C. Stadie

121

30 Nov 2019

Mean-field inference methods for neural networks

Marylou Gabrié

AI4CE

360

03 Nov 2019

Deep Learning Theory Review: An Optimal Control and Dynamical Systems Perspective

Guan-Horng Liu

Evangelos A. Theodorou

AI4CE

300

28 Aug 2019

A Mean Field Theory of Quantized Deep Networks: The Quantization-Depth Trade-Off

Neural Information Processing Systems (NeurIPS), 2019

Yaniv Blumenfeld

D. Gilboa

Daniel Soudry

195

03 Jun 2019

A Mean Field Theory of Batch Normalization

Greg Yang

Jeffrey Pennington

Vinay Rao

Jascha Narain Sohl-Dickstein

S. Schoenholz

196

185

21 Feb 2019