Scalable K-FAC Training for Deep Neural Networks with Distributed Preconditioning

30 June 2022

Papers citing "Scalable K-FAC Training for Deep Neural Networks with Distributed Preconditioning"


Title
On the Parameterization of Second-Order Optimization Effective Towards the Infinite Width | Satoki Ishikawa, Ryo Karakida | 24 2 0 | 19 Dec 2023
FedLPA: One-shot Federated Learning with Layer-Wise Posterior Aggregation | Xiang Liu, Liangxi Liu, Feiyang Ye, Yunheng Shen, Xia Li, Linshan Jiang, Jialin Li | 23 4 0 | 30 Sep 2023
Dual Gauss-Newton Directions for Deep Learning | Vincent Roulet, Mathieu Blondel | ODL 16 0 0 | 17 Aug 2023
Eva: A General Vectorized Approximation Framework for Second-order Optimization | Lin Zhang, S. Shi, Bo-wen Li | 13 1 0 | 04 Aug 2023
Compute-Efficient Deep Learning: Algorithmic Trends and Opportunities | Brian Bartoldson, B. Kailkhura, Davis W. Blalock | 29 47 0 | 13 Oct 2022
Accelerating Distributed K-FAC with Smart Parallelism of Computing and Communication Tasks | S. Shi, Lin Zhang, Bo-wen Li | 24 9 0 | 14 Jul 2021
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima | N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang | ODL 273 2,886 0 | 15 Sep 2016