Understanding Generalization of Federated Learning: the Trade-off between Model Stability and Optimization

25 November 2024

Abstract

Federated Learning (FL) is a distributed learning approach that trains machine learning models across multiple devices while keeping their local data private. However, FL often faces challenges due to data heterogeneity, which leads to inconsistent local optima among clients. These inconsistencies can cause unfavorable convergence behavior and degrade generalization performance. Existing studies mainly describe this issue through convergence analysis, which focuses on how well a model fits the training data, or through algorithmic stability, which examines the generalization gap. However, neither approach precisely captures the generalization performance of FL algorithms, especially for neural networks. This paper introduces a generalization dynamics analysis framework, named Libra, for algorithm-dependent excess risk minimization, highlighting the trade-off between model stability and optimization. Through this framework, we show how the generalization of FL algorithms is affected by the interplay of algorithmic stability and optimization. The framework applies to standard federated optimization and to its advanced variants, such as server momentum. Our findings suggest that larger local steps or momentum accelerate convergence but enlarge the stability term (that is, they weaken model stability), while yielding a better minimum excess risk. These insights can guide the design of future algorithms to achieve stronger generalization.
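To make the two knobs named in the abstract concrete, the following is a minimal sketch of federated averaging with a configurable number of local steps and an optional server momentum term. It is not the Libra framework and does not measure stability or excess risk; it only shows where local steps and server momentum enter the update. The least-squares objective, the synthetic heterogeneous clients, and all names (`local_sgd`, `fedavg`, `make_clients`, `global_loss`) are assumptions made for illustration.

```python
import numpy as np

def local_sgd(w, X, y, lr, local_steps):
    """Run `local_steps` full-batch gradient steps on one client's least-squares loss."""
    w = w.copy()
    for _ in range(local_steps):
        grad = X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def fedavg(clients, dim, rounds=50, local_steps=5, lr=0.05, server_momentum=0.0):
    """Federated averaging; with server_momentum > 0 the server keeps a velocity buffer."""
    w = np.zeros(dim)
    v = np.zeros(dim)
    for _ in range(rounds):
        # Each client starts from the current global model and returns its update.
        deltas = [local_sgd(w, X, y, lr, local_steps) - w for X, y in clients]
        avg_delta = np.mean(deltas, axis=0)
        # Server step: plain averaging when server_momentum == 0.
        v = server_momentum * v + avg_delta
        w = w + v
    return w

def make_clients(num_clients=4, n=32, dim=3, shift=1.0, seed=0):
    """Synthetic heterogeneous clients: each has its own ground-truth weights (hypothetical setup)."""
    rng = np.random.default_rng(seed)
    w_base = np.ones(dim)
    clients = []
    for _ in range(num_clients):
        w_k = w_base + shift * rng.normal(size=dim)  # client-specific optimum
        X = rng.normal(size=(n, dim))
        y = X @ w_k + 0.1 * rng.normal(size=n)
        clients.append((X, y))
    return clients

def global_loss(w, clients):
    """Average of the clients' training losses at w."""
    return float(np.mean([np.mean((X @ w - y) ** 2) / 2 for X, y in clients]))

clients = make_clients()
w_plain = fedavg(clients, dim=3)
w_mom = fedavg(clients, dim=3, server_momentum=0.5)
print("loss at zero init:   ", global_loss(np.zeros(3), clients))
print("FedAvg:              ", global_loss(w_plain, clients))
print("FedAvg + server mom.:", global_loss(w_mom, clients))
```

Varying `local_steps` and `server_momentum` in this toy setting changes how quickly the training loss falls. Observing the stability side of the trade-off would additionally require comparing models trained on neighboring datasets, which this sketch does not attempt.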


@article{zeng2025_2411.16303,
  title={Understanding Generalization of Federated Learning: the Trade-off between Model Stability and Optimization},
  author={Dun Zeng and Zheshun Wu and Shiyu Liu and Yu Pan and Xiaoying Tang and Zenglin Xu},
  journal={arXiv preprint arXiv:2411.16303},
  year={2025}
}
