Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2202.10435
Cited By
Survey on Large Scale Neural Network Training
21 February 2022
Julia Gusak
Daria Cherniuk
Alena Shilova
A. Katrutsa
Daniel Bershatsky
Xunyi Zhao
Lionel Eyraud-Dubois
Oleg Shlyazhko
Denis Dimitrov
Ivan V. Oseledets
Olivier Beaumont
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Survey on Large Scale Neural Network Training"
9 / 9 papers shown
Title
Quantization of Large Language Models with an Overdetermined Basis
D. Merkulov
Daria Cherniuk
Alexander Rudikov
Ivan V. Oseledets
Ekaterina A. Muravleva
A. Mikhalev
Boris Kashin
MQ
21
0
0
15 Apr 2024
Systems for Parallel and Distributed Large-Model Deep Learning Training
Kabir Nagrecha
GNN
VLM
MoE
18
7
0
06 Jan 2023
Eco2AI: carbon emissions tracking of machine learning models as the first step towards sustainable AI
S. Budennyy
V. Lazarev
N. Zakharenko
A. Korovin
Olga Plosskaya
...
Ivan V. Oseledets
I. Barsola
Ilya M. Egorov
A. Kosterina
L. Zhukov
24
89
0
31 Jul 2022
Few-Bit Backward: Quantized Gradients of Activation Functions for Memory Footprint Reduction
Georgii Sergeevich Novikov
Daniel Bershatsky
Julia Gusak
Alex Shonenkov
Denis Dimitrov
Ivan V. Oseledets
MQ
14
17
0
01 Feb 2022
Varuna: Scalable, Low-cost Training of Massive Deep Learning Models
Sanjith Athlur
Nitika Saran
Muthian Sivathanu
R. Ramjee
Nipun Kwatra
GNN
28
79
0
07 Nov 2021
M6-10T: A Sharing-Delinking Paradigm for Efficient Multi-Trillion Parameter Pretraining
Junyang Lin
An Yang
Jinze Bai
Chang Zhou
Le Jiang
...
Jie M. Zhang
Yong Li
Wei Lin
Jingren Zhou
Hongxia Yang
MoE
84
43
0
08 Oct 2021
Chimera: Efficiently Training Large-Scale Neural Networks with Bidirectional Pipelines
Shigang Li
Torsten Hoefler
GNN
AI4CE
LRM
77
130
0
14 Jul 2021
ZeRO-Offload: Democratizing Billion-Scale Model Training
Jie Ren
Samyam Rajbhandari
Reza Yazdani Aminabadi
Olatunji Ruwase
Shuangyang Yang
Minjia Zhang
Dong Li
Yuxiong He
MoE
160
413
0
18 Jan 2021
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar
Dheevatsa Mudigere
J. Nocedal
M. Smelyanskiy
P. T. P. Tang
ODL
273
2,878
0
15 Sep 2016
1