Revisiting EXTRA for Smooth Distributed Optimization

Abstract

EXTRA is a popular method for decentralized distributed optimization with broad applications. This paper revisits EXTRA. First, we give a sharp complexity analysis for EXTRA, with improved $O\left(\left(\frac{L}{\mu}+\frac{1}{1-\sigma_2(W)}\right)\log\frac{1}{\epsilon(1-\sigma_2(W))}\right)$ communication and computation complexities for $\mu$-strongly convex and $L$-smooth problems, where $\sigma_2(W)$ is the second largest singular value of the weight matrix $W$. When strong convexity is absent, we prove $O\left(\left(\frac{L}{\epsilon}+\frac{1}{1-\sigma_2(W)}\right)\log\frac{1}{1-\sigma_2(W)}\right)$ complexities. Then, we use the Catalyst framework to accelerate EXTRA and obtain $O\left(\sqrt{\frac{L}{\mu(1-\sigma_2(W))}}\log\frac{L}{\mu(1-\sigma_2(W))}\log\frac{1}{\epsilon}\right)$ communication and computation complexities for strongly convex and smooth problems, and $O\left(\sqrt{\frac{L}{\epsilon(1-\sigma_2(W))}}\log\frac{1}{\epsilon(1-\sigma_2(W))}\right)$ complexities for non-strongly convex ones. The communication complexities of the accelerated EXTRA are worse than the lower complexity bounds only by a factor of $\log\frac{L}{\mu(1-\sigma_2(W))}$ for strongly convex problems and a factor of $\log\frac{1}{\epsilon(1-\sigma_2(W))}$ for non-strongly convex problems.
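
The abstract does not state the EXTRA update itself. As a rough illustration only, the sketch below uses the commonly cited form of the recursion, $x^{k+2} = (I+W)x^{k+1} - \tilde{W}x^{k} - \alpha\left(\nabla f(x^{k+1}) - \nabla f(x^{k})\right)$ with $\tilde{W} = (I+W)/2$ and $x^{1} = Wx^{0} - \alpha\nabla f(x^{0})$. Everything else is an assumption made for the example and is not taken from the paper: the scalar local objectives $f_i(x) = \frac{1}{2}(x-b_i)^2$, the five-agent ring weight matrix, the step size $\alpha = 0.5$, and the helper names `ring_weights`, `mix`, and `extra`.

```python
# Minimal sketch of (non-accelerated) EXTRA on a toy problem.
# Assumptions (not from the paper): scalar quadratics f_i(x) = 0.5*(x - b_i)^2,
# a ring graph with weights 1/2 (self) and 1/4 (each neighbor), alpha = 0.5.

def ring_weights(n):
    """Symmetric, doubly stochastic weight matrix W for a ring of n >= 3 agents."""
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        W[i][i] = 0.5
        W[i][(i - 1) % n] += 0.25
        W[i][(i + 1) % n] += 0.25
    return W

def mix(W, x):
    """One communication round: each agent averages its neighbors' values."""
    n = len(x)
    return [sum(W[i][j] * x[j] for j in range(n)) for i in range(n)]

def extra(W, grads, x0, alpha, iters):
    """Run EXTRA; grads[i] is the gradient of agent i's local objective."""
    n = len(x0)
    x_prev = list(x0)
    g_prev = [grads[i](x_prev[i]) for i in range(n)]
    Wx0 = mix(W, x_prev)
    # First step: x^1 = W x^0 - alpha * grad f(x^0)
    x = [Wx0[i] - alpha * g_prev[i] for i in range(n)]
    for _ in range(iters):
        g = [grads[i](x[i]) for i in range(n)]
        Wx = mix(W, x)
        Wx_prev = mix(W, x_prev)
        # x^{k+2} = (I + W) x^{k+1} - ((I + W)/2) x^k - alpha * (g^{k+1} - g^k)
        x_next = [
            x[i] + Wx[i] - 0.5 * (x_prev[i] + Wx_prev[i]) - alpha * (g[i] - g_prev[i])
            for i in range(n)
        ]
        x_prev, x, g_prev = x, x_next, g
    return x

b = [1.0, 2.0, 3.0, 4.0, 5.0]                     # hypothetical local data
grads = [lambda x, bi=bi: x - bi for bi in b]     # gradient of 0.5*(x - b_i)^2
x_final = extra(ring_weights(len(b)), grads, [0.0] * len(b), alpha=0.5, iters=200)
```

In this toy setting the minimizer of $\sum_i f_i$ is the mean of the $b_i$, so every entry of `x_final` should approach 3.0. Each loop iteration here issues two `mix` calls for readability; caching `Wx` from the previous iteration would bring this down to one communication round per gradient evaluation, which is the per-iteration cost that the communication and computation complexities above count.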
