Mean Field Residual Networks: On the Edge of Chaos
Neural Information Processing Systems (NeurIPS), 2017
24 December 2017
Greg Yang
S. Schoenholz
arXiv: 1712.08969
Papers citing "Mean Field Residual Networks: On the Edge of Chaos"
Showing 50 of 130 papers
Integral Signatures of Activation Functions: A 9-Dimensional Taxonomy and Stability Theory for Deep Learning
Ankur Mali
Lawrence Hall
Jake Williams
Gordon Richards
103
0
0
09 Oct 2025
Arithmetic-Mean μP for Modern Architectures: A Unified Learning-Rate Scale for CNNs and ResNets
Haosong Zhang
Shenxi Wu
Yichi Zhang
Wei Lin
W. Lin
123
0
0
05 Oct 2025
Toward a Physics of Deep Learning and Brains
Arsham Ghavasieh
Meritxell Vila-Minana
Akanksha Khurd
John Beggs
Gerardo Ortiz
Santo Fortunato
64
1
0
26 Sep 2025
ResNets Are Deeper Than You Think
Christian H.X. Ali Mehmeti-Göpel
Michael Wand
184
1
0
17 Jun 2025
Is Random Attention Sufficient for Sequence Modeling? Disentangling Trainable Components in the Transformer
Yihe Dong
Lorenzo Noci
Mikhail Khodak
Mufan Li
441
1
0
01 Jun 2025
Two failure modes of deep transformers and how to avoid them: a unified theory of signal propagation at initialisation
Alessio Giorlandino
Sebastian Goldt
283
4
0
30 May 2025
GradAlign for Training-free Model Performance Inference
Yuxuan Li
Yunhui Guo
264
0
0
29 Nov 2024
Generalized Probabilistic Attention Mechanism in Transformers
DongNyeong Heo
Heeyoul Choi
277
3
0
21 Oct 2024
Collective variables of neural networks: empirical time evolution and scaling laws
S. Tovey
Sven Krippendorf
M. Spannowsky
Konstantin Nikolaou
Christian Holm
196
2
0
09 Oct 2024
UnitNorm: Rethinking Normalization for Transformers in Time Series
Nan Huang
C. Kümmerle
Xiang Zhang
AI4TS
238
4
0
24 May 2024
Principled Architecture-aware Scaling of Hyperparameters
Wuyang Chen
Junru Wu
Zhangyang Wang
Boris Hanin
AI4CE
298
2
0
27 Feb 2024
Deep Neural Network Initialization with Sparsity Inducing Activations
Ilan Price
Nicholas Daultry Ball
Samuel C.H. Lam
Adam C. Jones
Jared Tanner
AI4CE
185
2
0
25 Feb 2024
Neural Networks Asymptotic Behaviours for the Resolution of Inverse Problems
L. Del Debbio
Manuel Naviglio
Francesco Tarantelli
165
0
0
14 Feb 2024
Principled Weight Initialisation for Input-Convex Neural Networks
Pieter-Jan Hoedt
Günter Klambauer
218
11
0
19 Dec 2023
Commutative Width and Depth Scaling in Deep Neural Networks
Soufiane Hayou
222
2
0
02 Oct 2023
Fading memory as inductive bias in residual recurrent networks
Neural Networks (Neural Netw.), 2023
I. Dubinin
Felix Effenberger
220
9
0
27 Jul 2023
The Shaped Transformer: Attention Models in the Infinite Depth-and-Width Limit
Neural Information Processing Systems (NeurIPS), 2023
Lorenzo Noci
Chuning Li
Mufan Li
Bobby He
Thomas Hofmann
Chris J. Maddison
Daniel M. Roy
318
44
0
30 Jun 2023
Cerebras-GPT: Open Compute-Optimal Language Models Trained on the Cerebras Wafer-Scale Cluster
Nolan Dey
Gurpreet Gosal
Zhiming Chen
Hemant Khachane
William Marshall
Ribhu Pathria
Marvin Tom
Joel Hestness
MoE
LRM
287
122
0
06 Apr 2023
Understanding plasticity in neural networks
International Conference on Machine Learning (ICML), 2023
Clare Lyle
Zeyu Zheng
Evgenii Nikishin
Bernardo Avila-Pires
Razvan Pascanu
Will Dabney
AI4CE
503
136
0
02 Mar 2023
Byzantine-Robust Learning on Heterogeneous Data via Gradient Splitting
International Conference on Machine Learning (ICML), 2023
Yuchen Liu
Chen Chen
Lingjuan Lyu
Fangzhao Wu
Sai Wu
Gang Chen
262
22
0
13 Feb 2023
Width and Depth Limits Commute in Residual Networks
International Conference on Machine Learning (ICML), 2023
Soufiane Hayou
Greg Yang
239
17
0
01 Feb 2023
On the Initialisation of Wide Low-Rank Feedforward Neural Networks
Thiziri Nait Saada
Jared Tanner
172
2
0
31 Jan 2023
Mechanism of feature learning in deep fully connected networks and kernel machines that recursively learn features
Adityanarayanan Radhakrishnan
Daniel Beaglehole
Parthe Pandit
M. Belkin
FAtt
MLT
253
18
0
28 Dec 2022
Statistical Physics of Deep Neural Networks: Initialization toward Optimal Channels
Physical Review Research (Phys. Rev. Res.), 2022
Kangyu Weng
Aohua Cheng
Ziyang Zhang
Pei Sun
Yang Tian
276
5
0
04 Dec 2022
Analysis of Convolutions, Non-linearity and Depth in Graph Neural Networks using Neural Tangent Kernel
Mahalakshmi Sabanayagam
Pascal Esser
Debarghya Ghoshdastidar
379
4
0
18 Oct 2022
Component-Wise Natural Gradient Descent – An Efficient Neural Network Optimization
International Symposium on Computing and Networking - Across Practical Development and Theoretical Research (ISAPDTR), 2022
Tran van Sang
Mhd Irvan
R. Yamaguchi
Toshiyuki Nakata
198
1
0
11 Oct 2022
On Scrambling Phenomena for Randomly Initialized Recurrent Networks
Neural Information Processing Systems (NeurIPS), 2022
Vaggos Chatziafratis
Ioannis Panageas
Clayton Sanford
S. Stavroulakis
195
2
0
11 Oct 2022
On skip connections and normalisation layers in deep optimisation
Neural Information Processing Systems (NeurIPS), 2022
L. MacDonald
Jack Valmadre
Hemanth Saratchandran
Simon Lucey
ODL
401
4
0
10 Oct 2022
Dynamical Isometry for Residual Networks
Advait Gadhikar
R. Burkholz
ODL
AI4CE
213
2
0
05 Oct 2022
Random orthogonal additive filters: a solution to the vanishing/exploding gradient of deep neural networks
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2022
Andrea Ceni
ODL
156
11
0
03 Oct 2022
Omnigrok: Grokking Beyond Algorithmic Data
International Conference on Learning Representations (ICLR), 2022
Ziming Liu
Eric J. Michaud
Max Tegmark
360
111
0
03 Oct 2022
On the infinite-depth limit of finite-width neural networks
Soufiane Hayou
246
24
0
03 Oct 2022
A Closer Look at Learned Optimization: Stability, Robustness, and Inductive Biases
Neural Information Processing Systems (NeurIPS), 2022
James Harrison
Luke Metz
Jascha Narain Sohl-Dickstein
235
30
0
22 Sep 2022
PIM-QAT: Neural Network Quantization for Processing-In-Memory (PIM) Systems
Qing Jin
Zhiyu Chen
J. Ren
Yanyu Li
Yanzhi Wang
Kai-Min Yang
MQ
135
7
0
18 Sep 2022
Scaling ResNets in the Large-depth Regime
Pierre Marion
Adeline Fermanian
Gérard Biau
Jean-Philippe Vert
384
18
0
14 Jun 2022
The Neural Covariance SDE: Shaped Infinite Depth-and-Width Networks at Initialization
Neural Information Processing Systems (NeurIPS), 2022
Mufan Li
Mihai Nica
Daniel M. Roy
384
44
0
06 Jun 2022
Entangled Residual Mappings
Mathias Lechner
Ramin Hasani
Z. Babaiee
Radu Grosu
Daniela Rus
T. Henzinger
Sepp Hochreiter
226
5
0
02 Jun 2022
Do Residual Neural Networks discretize Neural Ordinary Differential Equations?
Neural Information Processing Systems (NeurIPS), 2022
Michael E. Sander
Pierre Ablin
Gabriel Peyré
266
34
0
29 May 2022
Self-Consistent Dynamical Field Theory of Kernel Evolution in Wide Neural Networks
Neural Information Processing Systems (NeurIPS), 2022
Blake Bordelon
Cengiz Pehlevan
MLT
351
108
0
19 May 2022
Convergence and Implicit Regularization Properties of Gradient Descent for Deep Residual Networks
Social Science Research Network (SSRN), 2022
R. Cont
Alain Rossier
Renyuan Xu
MLT
377
6
0
14 Apr 2022
Deep Learning without Shortcuts: Shaping the Kernel with Tailored Rectifiers
International Conference on Learning Representations (ICLR), 2022
Guodong Zhang
Aleksandar Botev
James Martens
OffRL
230
30
0
15 Mar 2022
Tensor Programs V: Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer
Greg Yang
J. E. Hu
Igor Babuschkin
Szymon Sidor
Xiaodong Liu
David Farhi
Nick Ryder
J. Pachocki
Weizhu Chen
Jianfeng Gao
359
223
0
07 Mar 2022
Neural Tangent Kernel Beyond the Infinite-Width Limit: Effects of Depth and Initialization
International Conference on Machine Learning (ICML), 2022
Mariia Seleznova
Gitta Kutyniok
408
28
0
01 Feb 2022
Critical Initialization of Wide and Deep Neural Networks through Partial Jacobians: General Theory and Applications
Darshil Doshi
Tianyu He
Andrey Gromov
341
10
0
23 Nov 2021
Gradients are Not All You Need
Luke Metz
C. Freeman
S. Schoenholz
Tal Kachman
234
102
0
10 Nov 2021
A Johnson–Lindenstrauss Framework for Randomly Initialized CNNs
Ido Nachum
Jan Hązła
Michael C. Gastpar
Anatoly Khina
180
0
0
03 Nov 2021
Free Probability for predicting the performance of feed-forward fully connected neural networks
Neural Information Processing Systems (NeurIPS), 2021
Reda Chhaibi
Tariq Daouda
E. Kahn
ODL
294
4
0
01 Nov 2021
Feature Learning and Signal Propagation in Deep Neural Networks
International Conference on Machine Learning (ICML), 2021
Yizhang Lou
Chris Mingard
Yoonsoo Nam
Soufiane Hayou
MDE
220
18
0
22 Oct 2021
The Future is Log-Gaussian: ResNets and Their Infinite-Depth-and-Width Limit at Initialization
Neural Information Processing Systems (NeurIPS), 2021
Mufan Li
Mihai Nica
Daniel M. Roy
316
36
0
07 Jun 2021
Regularization in ResNet with Stochastic Depth
Neural Information Processing Systems (NeurIPS), 2021
Soufiane Hayou
Fadhel Ayed
137
15
0
06 Jun 2021