DR-DSGD: A Distributionally Robust Decentralized Learning Algorithm over Graphs

Abstract

In this paper, we propose to solve a regularized distributionally robust learning problem in the decentralized setting, taking into account the data distribution shift. By adding a Kullback-Leibler regularization function to the robust min-max optimization problem, the learning problem can be reduced to a modified robust minimization problem and solved efficiently. Leveraging the newly formulated optimization problem, we propose a robust version of Decentralized Stochastic Gradient Descent (DSGD), coined Distributionally Robust Decentralized Stochastic Gradient Descent (DR-DSGD). Under some mild assumptions and provided that the regularization parameter is larger than one, we theoretically prove that DR-DSGD achieves a convergence rate of $\mathcal{O}\left(1/\sqrt{KT} + K/T\right)$, where $K$ is the number of devices and $T$ is the number of iterations. Simulation results show that our proposed algorithm can improve the worst distribution test accuracy by up to 10%. Moreover, DR-DSGD is more communication-efficient than DSGD since it requires fewer communication rounds (up to 20 times fewer) to achieve the same worst distribution test accuracy target. Furthermore, the conducted experiments reveal that DR-DSGD results in a fairer performance across devices in terms of test accuracy.
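The reduction from a min-max problem to a plain minimization can be illustrated with a standard identity, sketched below under assumptions that the abstract does not spell out: a uniform prior over the $K$ devices, per-device losses `losses`, and a regularization parameter `alpha`. With these choices, the inner maximization over device weights has a closed form (a scaled log-sum-exp of the losses), attained at softmax-style weights. All function and variable names here are hypothetical, and the paper's exact formulation may differ.

```python
import math

def kl_to_uniform(lam):
    """KL divergence between weights `lam` and the uniform distribution."""
    K = len(lam)
    return sum(l * math.log(l * K) for l in lam if l > 0)

def regularized_objective(lam, losses, alpha):
    """Weighted loss minus alpha times the KL penalty (inner max objective)."""
    weighted = sum(l * f for l, f in zip(lam, losses))
    return weighted - alpha * kl_to_uniform(lam)

def closed_form(losses, alpha):
    """alpha * log(mean(exp(loss / alpha))), computed stably."""
    K = len(losses)
    m = max(losses)
    return m + alpha * math.log(sum(math.exp((f - m) / alpha) for f in losses) / K)

def maximizing_weights(losses, alpha):
    """Weights proportional to exp(loss / alpha); these attain the inner max."""
    m = max(losses)
    w = [math.exp((f - m) / alpha) for f in losses]
    total = sum(w)
    return [x / total for x in w]

# Assumed toy values, for illustration only.
losses = [0.2, 1.5, 0.7]
alpha = 2.0

lam_star = maximizing_weights(losses, alpha)
inner_max = regularized_objective(lam_star, losses, alpha)
value = closed_form(losses, alpha)
uniform_value = regularized_objective([1 / 3] * 3, losses, alpha)
```

In this sketch, `inner_max` and `value` agree up to floating-point error, so the outer minimization only needs to handle the closed-form objective. Devices with larger losses receive larger weights, and the closed-form value lies between the average loss and the worst-case loss.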
