ResearchTrend.AI

On the Parameterization of Second-Order Optimization Effective Towards the Infinite Width

19 December 2023
Satoki Ishikawa
Ryo Karakida

Papers citing "On the Parameterization of Second-Order Optimization Effective Towards the Infinite Width"

5 / 5 papers shown
Gradient Descent on Neurons and its Link to Approximate Second-Order Optimization
Frederik Benzing
ODL
35
23
0
28 Jan 2022
Accelerating Distributed K-FAC with Smart Parallelism of Computing and Communication Tasks
S. Shi
Lin Zhang
Bo-wen Li
13
9
0
14 Jul 2021
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
243
1,815
0
17 Sep 2019
Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks
Lechao Xiao
Yasaman Bahri
Jascha Narain Sohl-Dickstein
S. Schoenholz
Jeffrey Pennington
220
347
0
14 Jun 2018
Efficient Estimation of Word Representations in Vector Space
Tomáš Mikolov
Kai Chen
G. Corrado
J. Dean
3DV
228
31,150
0
16 Jan 2013