
Federated Learning: A Stochastic Approximation Approach

Main: 28 pages
22 figures
1 table
Bibliography: 2 pages
Abstract

This paper considers federated learning (FL) in a stochastic approximation (SA) framework. Each client $i$ trains a local model using its dataset $\mathcal{D}^{(i)}$ and periodically transmits the model parameters $w^{(i)}_n$ to a central server, where they are aggregated into a global model parameter $\bar{w}_n$ and sent back. The clients then continue training after re-initializing their local models with the global model parameters. Prior works typically assumed constant (and often identical) step sizes (learning rates) across clients; as a consequence, the aggregated model converges only in expectation. In this work, client-specific tapering step sizes $a^{(i)}_n$ are used. The global model is shown to track an ODE whose forcing function is the weighted sum of the negative gradients of the individual clients, the weights being the limiting ratios $p^{(i)} = \lim_{n \to \infty} \frac{a^{(i)}_n}{a^{(1)}_n}$ of the step sizes, where $a^{(1)}_n \geq a^{(i)}_n$ for all $n$. Unlike with constant step sizes, the convergence here is with probability one. In this framework, clients with larger $p^{(i)}$ exert a greater influence on the global model than those with smaller $p^{(i)}$, which can be used to favor clients holding rare and uncommon data. Numerical experiments validate the convergence and demonstrate how the step sizes can be chosen to regulate the influence of the clients.
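The following is a minimal sketch of the kind of scheme the abstract describes: local SA updates with client-specific tapering step sizes $a^{(i)}_n$ and periodic averaging at a server. The quadratic local objectives, the step-size constants, the synchronization period, and all names (local_grad, sync_every, etc.) are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

num_clients = 3
dim = 2
sync_every = 10          # rounds between aggregations (assumption)
num_steps = 5000

# Each client i minimizes a quadratic f_i(w) = 0.5 * ||w - m_i||^2,
# so its gradient is (w - m_i) (illustrative local objectives).
minima = rng.normal(size=(num_clients, dim))

# Tapering step sizes a_n^(i) = c_i / n with c_1 >= c_i, so the limiting
# ratios p^(i) = lim a_n^(i) / a_n^(1) = c_i / c_1 weight client i's gradient.
c = np.array([1.0, 0.5, 0.25])

w_local = np.zeros((num_clients, dim))
w_bar = np.zeros(dim)

for n in range(1, num_steps + 1):
    for i in range(num_clients):
        a_n = c[i] / n                        # tapering step size
        grad = w_local[i] - minima[i]         # local gradient
        noise = 0.1 * rng.normal(size=dim)    # SA-style gradient noise
        w_local[i] -= a_n * (grad + noise)    # local SA update
    if n % sync_every == 0:
        w_bar = w_local.mean(axis=0)          # server aggregation
        w_local[:] = w_bar                    # re-initialize local models

# For these quadratics, the ODE's equilibrium is the p-weighted
# combination of the local minima; the global iterate should be close to it.
p = c / c[0]
print("global model:      ", w_bar)
print("p-weighted target: ", (p[:, None] * minima).sum(axis=0) / p.sum())
```

Under this toy setup, increasing $c_i$ (and hence $p^{(i)}$) pulls the global model toward client $i$'s minimizer, mirroring the abstract's point that the step-size ratios regulate client influence.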
