Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2312.12226
Cited By
On the Parameterization of Second-Order Optimization Effective Towards the Infinite Width
19 December 2023
Satoki Ishikawa
Ryo Karakida
Re-assign community
ArXiv
PDF
HTML
Papers citing
"On the Parameterization of Second-Order Optimization Effective Towards the Infinite Width"
5 / 5 papers shown
Title
Gradient Descent on Neurons and its Link to Approximate Second-Order Optimization
Frederik Benzing
ODL
35
23
0
28 Jan 2022
Accelerating Distributed K-FAC with Smart Parallelism of Computing and Communication Tasks
S. Shi
Lin Zhang
Bo-wen Li
13
9
0
14 Jul 2021
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
243
1,815
0
17 Sep 2019
Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks
Lechao Xiao
Yasaman Bahri
Jascha Narain Sohl-Dickstein
S. Schoenholz
Jeffrey Pennington
220
347
0
14 Jun 2018
Efficient Estimation of Word Representations in Vector Space
Tomáš Mikolov
Kai Chen
G. Corrado
J. Dean
3DV
228
31,150
0
16 Jan 2013
1