2006.14548
Cited By
Tensor Programs II: Neural Tangent Kernel for Any Architecture
25 June 2020
Greg Yang
Papers citing "Tensor Programs II: Neural Tangent Kernel for Any Architecture"
29 / 29 papers shown
Title
Theoretical characterisation of the Gauss-Newton conditioning in Neural Networks
Jim Zhao
Sidak Pal Singh
Aurélien Lucchi
AI4CE
43
0
0
04 Nov 2024
Investigating the Impact of Model Complexity in Large Language Models
Jing Luo
Huiyuan Wang
Weiran Huang
34
0
0
01 Oct 2024
Input Space Mode Connectivity in Deep Neural Networks
Jakub Vrabel
Ori Shem-Ur
Yaron Oz
David Krueger
48
1
0
09 Sep 2024
u-μP: The Unit-Scaled Maximal Update Parametrization
Charlie Blake
C. Eichenberg
Josef Dean
Lukas Balles
Luke Y. Prince
Bjorn Deiseroth
Andres Felipe Cruz Salinas
Carlo Luschi
Samuel Weinbach
Douglas Orr
53
9
0
24 Jul 2024
NTK-Guided Few-Shot Class Incremental Learning
Jingren Liu
Zhong Ji
Yanwei Pang
Yunlong Yu
CLL
34
3
0
19 Mar 2024
Weak Correlations as the Underlying Principle for Linearization of Gradient-Based Learning Systems
Ori Shem-Ur
Yaron Oz
14
0
0
08 Jan 2024
On the Neural Tangent Kernel of Equilibrium Models
Zhili Feng
J. Zico Kolter
16
6
0
21 Oct 2023
Quantitative CLTs in Deep Neural Networks
Stefano Favaro
Boris Hanin
Domenico Marinucci
I. Nourdin
G. Peccati
BDL
23
11
0
12 Jul 2023
ADLER -- An efficient Hessian-based strategy for adaptive learning rate
Dario Balboni
D. Bacciu
ODL
14
0
0
25 May 2023
Global Optimality of Elman-type RNN in the Mean-Field Regime
Andrea Agazzi
Jian-Xiong Lu
Sayan Mukherjee
MLT
26
1
0
12 Mar 2023
A Theoretical Understanding of Shallow Vision Transformers: Learning, Generalization, and Sample Complexity
Hongkang Li
M. Wang
Sijia Liu
Pin-Yu Chen
ViT
MLT
35
56
0
12 Feb 2023
Width and Depth Limits Commute in Residual Networks
Soufiane Hayou
Greg Yang
42
14
0
01 Feb 2023
An Analysis of Attention via the Lens of Exchangeability and Latent Variable Models
Yufeng Zhang
Boyi Liu
Qi Cai
Lingxiao Wang
Zhaoran Wang
45
11
0
30 Dec 2022
Making Look-Ahead Active Learning Strategies Feasible with Neural Tangent Kernels
Mohamad Amin Mohamadi
Wonho Bae
Danica J. Sutherland
28
20
0
25 Jun 2022
Large-width asymptotics for ReLU neural networks with α-Stable initializations
Stefano Favaro
S. Fortini
Stefano Peluchetti
20
2
0
16 Jun 2022
Overcoming the Spectral Bias of Neural Value Approximation
Ge Yang
Anurag Ajay
Pulkit Agrawal
32
25
0
09 Jun 2022
Global Convergence of Over-parameterized Deep Equilibrium Models
Zenan Ling
Xingyu Xie
Qiuhao Wang
Zongpeng Zhang
Zhouchen Lin
27
12
0
27 May 2022
Empirical Phase Diagram for Three-layer Neural Networks with Infinite Width
Hanxu Zhou
Qixuan Zhou
Zhenyuan Jin
Tao Luo
Yaoyu Zhang
Zhi-Qin John Xu
22
20
0
24 May 2022
Self-Consistent Dynamical Field Theory of Kernel Evolution in Wide Neural Networks
Blake Bordelon
C. Pehlevan
MLT
24
79
0
19 May 2022
Tensor Programs V: Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer
Greg Yang
J. E. Hu
Igor Babuschkin
Szymon Sidor
Xiaodong Liu
David Farhi
Nick Ryder
J. Pachocki
Weizhu Chen
Jianfeng Gao
24
148
0
07 Mar 2022
Generalization Through The Lens Of Leave-One-Out Error
Gregor Bachmann
Thomas Hofmann
Aurélien Lucchi
44
7
0
07 Mar 2022
Inductive Biases and Variable Creation in Self-Attention Mechanisms
Benjamin L. Edelman
Surbhi Goel
Sham Kakade
Cyril Zhang
27
115
0
19 Oct 2021
Nonperturbative renormalization for the neural network-QFT correspondence
Harold Erbin
Vincent Lahoche
D. O. Samary
21
30
0
03 Aug 2021
Random Neural Networks in the Infinite Width Limit as Gaussian Processes
Boris Hanin
BDL
24
43
0
04 Jul 2021
The Limitations of Large Width in Neural Networks: A Deep Gaussian Process Perspective
Geoff Pleiss
John P. Cunningham
26
24
0
11 Jun 2021
A Neural Tangent Kernel Perspective of GANs
Jean-Yves Franceschi
Emmanuel de Bézenac
Ibrahim Ayed
Mickaël Chen
Sylvain Lamprier
Patrick Gallinari
29
26
0
10 Jun 2021
The Future is Log-Gaussian: ResNets and Their Infinite-Depth-and-Width Limit at Initialization
Mufan Bill Li
Mihai Nica
Daniel M. Roy
23
33
0
07 Jun 2021
Priors in Bayesian Deep Learning: A Review
Vincent Fortuin
UQCV
BDL
29
124
0
14 May 2021
Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks
Lechao Xiao
Yasaman Bahri
Jascha Narain Sohl-Dickstein
S. Schoenholz
Jeffrey Pennington
220
348
0
14 Jun 2018