Mean Field Residual Networks: On the Edge of Chaos
Greg Yang, S. Schoenholz · 24 December 2017 · arXiv:1712.08969
Papers citing "Mean Field Residual Networks: On the Edge of Chaos" (50 of 52 shown):
Principled Architecture-aware Scaling of Hyperparameters
Wuyang Chen, Junru Wu, Zhangyang Wang, Boris Hanin · AI4CE · 27 Feb 2024

Fading memory as inductive bias in residual recurrent networks
I. Dubinin, Felix Effenberger · 27 Jul 2023

Understanding plasticity in neural networks
Clare Lyle, Zeyu Zheng, Evgenii Nikishin, Bernardo Avila-Pires, Razvan Pascanu, Will Dabney · AI4CE · 02 Mar 2023

Width and Depth Limits Commute in Residual Networks
Soufiane Hayou, Greg Yang · 01 Feb 2023

On the Initialisation of Wide Low-Rank Feedforward Neural Networks
Thiziri Nait Saada, Jared Tanner · 31 Jan 2023

Statistical Physics of Deep Neural Networks: Initialization toward Optimal Channels
Kangyu Weng, Aohua Cheng, Ziyang Zhang, Pei Sun, Yang Tian · 04 Dec 2022

Component-Wise Natural Gradient Descent – An Efficient Neural Network Optimization
Tran van Sang, Mhd Irvan, R. Yamaguchi, Toshiyuki Nakata · 11 Oct 2022

On skip connections and normalisation layers in deep optimisation
L. MacDonald, Jack Valmadre, Hemanth Saratchandran, Simon Lucey · ODL · 10 Oct 2022

Dynamical Isometry for Residual Networks
Advait Gadhikar, R. Burkholz · ODL, AI4CE · 05 Oct 2022

Random orthogonal additive filters: a solution to the vanishing/exploding gradient of deep neural networks
Andrea Ceni · ODL · 03 Oct 2022

A Closer Look at Learned Optimization: Stability, Robustness, and Inductive Biases
James Harrison, Luke Metz, Jascha Narain Sohl-Dickstein · 22 Sep 2022

PIM-QAT: Neural Network Quantization for Processing-In-Memory (PIM) Systems
Qing Jin, Zhiyu Chen, J. Ren, Yanyu Li, Yanzhi Wang, Kai-Min Yang · MQ · 18 Sep 2022

Scaling ResNets in the Large-depth Regime
Pierre Marion, Adeline Fermanian, Gérard Biau, Jean-Philippe Vert · 14 Jun 2022

Entangled Residual Mappings
Mathias Lechner, Ramin Hasani, Z. Babaiee, Radu Grosu, Daniela Rus, T. Henzinger, Sepp Hochreiter · 02 Jun 2022

Do Residual Neural Networks discretize Neural Ordinary Differential Equations?
Michael E. Sander, Pierre Ablin, Gabriel Peyré · 29 May 2022

Self-Consistent Dynamical Field Theory of Kernel Evolution in Wide Neural Networks
Blake Bordelon, Cengiz Pehlevan · MLT · 19 May 2022

Deep Learning without Shortcuts: Shaping the Kernel with Tailored Rectifiers
Guodong Zhang, Aleksandar Botev, James Martens · OffRL · 15 Mar 2022

Tensor Programs V: Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer
Greg Yang, J. E. Hu, Igor Babuschkin, Szymon Sidor, Xiaodong Liu, David Farhi, Nick Ryder, J. Pachocki, Weizhu Chen, Jianfeng Gao · 07 Mar 2022

Gradients are Not All You Need
Luke Metz, C. Freeman, S. Schoenholz, Tal Kachman · 10 Nov 2021

A Johnson–Lindenstrauss Framework for Randomly Initialized CNNs
Ido Nachum, Jan Hązła, Michael C. Gastpar, Anatoly Khina · 03 Nov 2021

The Future is Log-Gaussian: ResNets and Their Infinite-Depth-and-Width Limit at Initialization
Mufan Li, Mihai Nica, Daniel M. Roy · 07 Jun 2021

Initialization and Regularization of Factorized Neural Layers
M. Khodak, Neil A. Tenenholtz, Lester W. Mackey, Nicolò Fusi · 03 May 2021

ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training
Jianfei Chen, Lianmin Zheng, Z. Yao, Dequan Wang, Ion Stoica, Michael W. Mahoney, Joseph E. Gonzalez · MQ · 29 Apr 2021

Advances in Electron Microscopy with Deep Learning
Jeffrey M. Ede · 04 Jan 2021

Stable ResNet
Soufiane Hayou, Eugenio Clerico, Bo He, George Deligiannidis, Arnaud Doucet, Judith Rousseau · ODL, SSeg · 24 Oct 2020

BYOL works even without batch statistics
Pierre Harvey Richemond, Jean-Bastien Grill, Florent Altché, Corentin Tallec, Florian Strub, ..., Samuel L. Smith, Soham De, Razvan Pascanu, Bilal Piot, Michal Valko · SSL · 20 Oct 2020

Understanding Approximate Fisher Information for Fast Convergence of Natural Gradient Descent in Wide Neural Networks
Ryo Karakida, Kazuki Osawa · 02 Oct 2020

Tensor Programs III: Neural Matrix Laws
Greg Yang · 22 Sep 2020

Review: Deep Learning in Electron Microscopy
Jeffrey M. Ede · 17 Sep 2020

Beyond Signal Propagation: Is Feature Diversity Necessary in Deep Neural Network Initialization?
Yaniv Blumenfeld, D. Gilboa, Daniel Soudry · ODL · 02 Jul 2020

Tensor Programs II: Neural Tangent Kernel for Any Architecture
Greg Yang · 25 Jun 2020

ReZero is All You Need: Fast Convergence at Large Depth
Thomas C. Bachlechner, Bodhisattwa Prasad Majumder, H. H. Mao, G. Cottrell, Julian McAuley · AI4CE · 10 Mar 2020

On the distance between two neural networks and the stability of learning
Jeremy Bernstein, Arash Vahdat, Yisong Yue, Xuan Li · ODL · 09 Feb 2020

Towards Efficient Training for Neural Network Quantization
Qing Jin, Linjie Yang, Zhenyu A. Liao · MQ · 21 Dec 2019

Mean field theory for deep dropout networks: digging up gradient backpropagation deeply
Wei Huang, R. Xu, Weitao Du, Yutian Zeng, Yunce Zhao · 19 Dec 2019

Optimization for deep learning: theory and algorithms
Ruoyu Sun · ODL · 19 Dec 2019

Neural Tangents: Fast and Easy Infinite Neural Networks in Python
Roman Novak, Lechao Xiao, Jiri Hron, Jaehoon Lee, Alexander A. Alemi, Jascha Narain Sohl-Dickstein, S. Schoenholz · 05 Dec 2019

Tensor Programs I: Wide Feedforward or Recurrent Neural Networks of Any Architecture are Gaussian Processes
Greg Yang · 28 Oct 2019

On the expected behaviour of noise regularised deep neural networks as Gaussian processes
Arnu Pretorius, Herman Kamper, Steve Kroon · 12 Oct 2019

The Normalization Method for Alleviating Pathological Sharpness in Wide Neural Networks
Ryo Karakida, S. Akaho, S. Amari · 07 Jun 2019

Infinitely deep neural networks as diffusion processes
Stefano Peluchetti, Stefano Favaro · ODL · 27 May 2019

Mean-field Analysis of Batch Normalization
Ming-Bo Wei, J. Stokes, D. Schwab · MLT · 06 Mar 2019

Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent
Jaehoon Lee, Lechao Xiao, S. Schoenholz, Yasaman Bahri, Roman Novak, Jascha Narain Sohl-Dickstein, Jeffrey Pennington · 18 Feb 2019

Dynamical Isometry and a Mean Field Theory of LSTMs and GRUs
D. Gilboa, B. Chang, Minmin Chen, Greg Yang, S. Schoenholz, Ed H. Chi, Jeffrey Pennington · 25 Jan 2019

NIPS - Not Even Wrong? A Systematic Review of Empirically Complete Demonstrations of Algorithmic Effectiveness in the Machine Learning and Artificial Intelligence Literature
Franz J. Király, Bilal A. Mateen, R. Sonabend · 18 Dec 2018

On the Convergence Rate of Training Recurrent Neural Networks
Zeyuan Allen-Zhu, Yuanzhi Li, Zhao Song · 29 Oct 2018

Bayesian Deep Convolutional Networks with Many Channels are Gaussian Processes
Roman Novak, Lechao Xiao, Jaehoon Lee, Yasaman Bahri, Greg Yang, Jiri Hron, Daniel A. Abolafia, Jeffrey Pennington, Jascha Narain Sohl-Dickstein · UQCV, BDL · 11 Oct 2018

Fisher Information and Natural Gradient Learning of Random Deep Networks
S. Amari, Ryo Karakida, Masafumi Oizumi · 22 Aug 2018

Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks
Lechao Xiao, Yasaman Bahri, Jascha Narain Sohl-Dickstein, S. Schoenholz, Jeffrey Pennington · 14 Jun 2018

Universal Statistics of Fisher Information in Deep Neural Networks: Mean Field Approach
Ryo Karakida, S. Akaho, S. Amari · FedML · 04 Jun 2018