ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

arXiv: 1606.07365
Parallel SGD: When does averaging help?

23 June 2016
Jian Zhang, Christopher De Sa, Ioannis Mitliagkas, Christopher Ré
Topics: MoMe, FedML
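The scheme the title refers to, running several SGD workers independently and averaging their final models, can be illustrated with a toy sketch. This is a minimal illustration only: the 1-D least-squares objective, step size, worker count, and seeds below are assumptions for demonstration, not values or code from the paper.

```python
import random

def sgd_worker(data, steps, lr, seed):
    """Run plain SGD from w = 0 on the loss (w*x - y)^2, sampling with replacement."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        x, y = rng.choice(data)
        w -= lr * 2 * x * (w * x - y)  # gradient of (w*x - y)^2 w.r.t. w
    return w

def parallel_sgd_average(data, workers=4, steps=200, lr=0.05):
    """One-shot averaging: each worker runs SGD independently, then the
    final iterates are averaged into a single model."""
    finals = [sgd_worker(data, steps, lr, seed=k) for k in range(workers)]
    return sum(finals) / len(finals)

# Synthetic regression data with true slope 3.0 and small label noise.
data_rng = random.Random(0)
true_w = 3.0
data = [(x, true_w * x + data_rng.gauss(0.0, 0.1))
        for x in (data_rng.uniform(-1.0, 1.0) for _ in range(200))]

w_single = sgd_worker(data, steps=200, lr=0.05, seed=0)
w_avg = parallel_sgd_average(data)
print(f"single worker: {w_single:.3f}, averaged: {w_avg:.3f}")
```

Averaging the independent iterates reduces the variance contributed by stochastic gradient noise; the paper's question is when this helps versus when it does not (e.g., for non-convex or ill-conditioned problems).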

Papers citing "Parallel SGD: When does averaging help?"

9 / 9 papers shown
1. EDiT: A Local-SGD-Based Efficient Distributed Training Method for Large Language Models. Jialiang Cheng, Ning Gao, Yun Yue, Zhiling Ye, Jiadi Jiang, Jian Sha. (OffRL) 10 Dec 2024.
2. Communication Optimization Strategies for Distributed Deep Neural Network Training: A Survey. Shuo Ouyang, Dezun Dong, Yemao Xu, Liquan Xiao. 06 Mar 2020.
3. Asynchronous Parallel Stochastic Gradient for Nonconvex Optimization. Xiangru Lian, Yijun Huang, Y. Li, Ji Liu. 27 Jun 2015.
4. Splash: User-friendly Programming Interface for Parallelizing Stochastic Algorithms. Yuchen Zhang, Michael I. Jordan. 24 Jun 2015.
5. Global Convergence of Stochastic Gradient Descent for Some Non-convex Matrix Problems. Christopher De Sa, K. Olukotun, Christopher Ré. 05 Nov 2014.
6. DimmWitted: A Study of Main-Memory Statistical Analytics. Ce Zhang, Christopher Ré. 28 Mar 2014.
7. HOGWILD!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent. Feng Niu, Benjamin Recht, Christopher Ré, Stephen J. Wright. 28 Jun 2011.
8. Distributed Delayed Stochastic Optimization. Alekh Agarwal, John C. Duchi. 28 Apr 2011.
9. Optimal Distributed Online Prediction using Mini-Batches. O. Dekel, Ran Gilad-Bachrach, Ohad Shamir, Lin Xiao. 07 Dec 2010.