DropCompute: simple and more robust distributed synchronous training via compute variance reduction
18 June 2023 · arXiv:2306.10598
Niv Giladi, Shahar Gottlieb, Moran Shkolnik, A. Karnieli, Ron Banner, Elad Hoffer, Kfir Y. Levy, Daniel Soudry

Papers citing "DropCompute: simple and more robust distributed synchronous training via compute variance reduction" (6 papers)

Understanding Stragglers in Large Model Training Using What-if Analysis
Jinkun Lin, Ziheng Jiang, Zuquan Song, Sida Zhao, Menghan Yu, ..., Shuguang Wang, Haibin Lin, Xin Liu, Aurojit Panda, Jinyang Li
09 May 2025

From promise to practice: realizing high-performance decentralized training
Zesen Wang, Jiaojiao Zhang, Xuyang Wu, M. Johansson
15 Oct 2024

Stochastic Training is Not Necessary for Generalization
Jonas Geiping, Micah Goldblum, Phillip E. Pope, Michael Moeller, Tom Goldstein
29 Sep 2021

ScaleCom: Scalable Sparsified Gradient Compression for Communication-Efficient Distributed Training
Chia-Yu Chen, Jiamin Ni, Songtao Lu, Xiaodong Cui, Pin-Yu Chen, ..., Naigang Wang, Swagath Venkataramani, Vijayalakshmi Srinivasan, Wei Zhang, K. Gopalakrishnan
21 Apr 2021

Scaling Laws for Neural Language Models
Jared Kaplan, Sam McCandlish, T. Henighan, Tom B. Brown, B. Chess, R. Child, Scott Gray, Alec Radford, Jeff Wu, Dario Amodei
23 Jan 2020

Optimal Distributed Online Prediction using Mini-Batches
O. Dekel, Ran Gilad-Bachrach, Ohad Shamir, Lin Xiao
07 Dec 2010