One-Shot Federated Learning for Model Clustering and Learning in
Heterogeneous Environments
The paper presents a communication-efficient approach for federated learning in heterogeneous environments in which each user obtains data from one of several different data distributions. In the proposed setup, the number of data distributions and their underlying statistical properties, as well as the user cluster structure (i.e., the grouping of users based on the data distributions they sample from), are a priori unknown. A one-shot decentralized learning approach, based on a single round of communication between the users and the server, is proposed with the objective of learning the true model at each user. The proposed one-shot approach, based on local computations at the users and a convex-clustering-based aggregation step at the server, is shown to provide strong learning guarantees in such heterogeneous environments. In particular, it is shown that for strongly convex learning setups, as long as the number of data points per user is above a threshold, the proposed approach achieves order-optimal mean-squared error (MSE) rates in terms of the sample size, relative to a hypothetical oracle that has access to all data points at all users and perfect information about the number of distributions and the user cluster structure, i.e., the assignment of distributions to users. An explicit characterization of the threshold is provided in terms of the problem parameters. Numerical experiments illustrate the findings and corroborate the performance of the proposed method.
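The one-shot pipeline described above can be sketched in a toy simulation. The sketch below is illustrative only and uses assumed parameters (three Gaussian mean distributions, 30 users, 200 samples per user) not taken from the paper; mean estimation stands in for the strongly convex local learning problem, and a simple gap-threshold grouping stands in for the paper's convex-clustering aggregation step at the server.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy setup: 3 unknown distributions (their means), 30 users,
# each user drawing n_points samples from exactly one distribution.
true_means = np.array([-5.0, 0.0, 5.0])
n_points = 200
assignments = np.repeat([0, 1, 2], 10)  # user cluster structure, unknown to the server

# Single communication round (users -> server): each user sends its
# locally computed model; here, the empirical mean of its own samples.
local_models = np.array([
    rng.normal(true_means[a], 1.0, n_points).mean() for a in assignments
])

def cluster_and_average(models, threshold=1.0):
    """Group local models whose sorted pairwise gaps fall below a
    threshold, then average within each recovered cluster (a stand-in
    for the paper's convex-clustering aggregation)."""
    order = np.argsort(models)
    clusters, current = [], [order[0]]
    for i, j in zip(order, order[1:]):
        if models[j] - models[i] <= threshold:
            current.append(j)
        else:
            clusters.append(current)
            current = [j]
    clusters.append(current)
    return [models[idx].mean() for idx in clusters]

cluster_models = cluster_and_average(local_models)
print(sorted(cluster_models))  # one aggregated model per recovered cluster
```

With enough samples per user, each local estimate concentrates near its distribution's mean, the gaps between groups dominate the within-group spread, and the server recovers both the number of distributions and the per-cluster models without knowing either in advance, mirroring the threshold condition in the abstract.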