γ-FedHT: Stepsize-Aware Hard-Threshold Gradient Compression in Federated Learning

IEEE Conference on Computer Communications (IEEE INFOCOM), 2025
Main: 9 pages
5 figures
Bibliography: 1 page
4 tables
Abstract

Gradient compression can effectively alleviate communication bottlenecks in Federated Learning (FL). Contemporary state-of-the-art sparse compressors, such as Top-k, exhibit high computational complexity, up to O(d log₂ k), where d is the number of model parameters. The hard-threshold compressor, which simply transmits elements with absolute values above a fixed threshold, reduces this complexity to O(d). However, hard-threshold compression causes accuracy degradation in FL, where the datasets are non-IID and the stepsize γ decays to ensure model convergence. The decaying stepsize shrinks the updates and causes the compression ratio of hard-threshold compression to drop rapidly to an aggressive level, at or below which the model accuracy degrades severely. To address this, we propose γ-FedHT, a stepsize-aware low-cost compressor with Error-Feedback to guarantee convergence. Since the traditional theoretical framework of FL does not account for Error-Feedback, we introduce the fundamental conversion of Error-Feedback into this framework. We prove that γ-FedHT has a convergence rate of O(1/T) (where T is the total number of training iterations) in μ-strongly convex cases and O(1/√T) in non-convex cases, the same as FedAVG. Extensive experiments demonstrate that γ-FedHT improves accuracy by up to 7.42% over Top-k under equal communication traffic on various non-IID image datasets.
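To make the mechanism concrete, below is a minimal Python/NumPy sketch of a hard-threshold compressor with Error-Feedback whose threshold is scaled with the stepsize γ, so that the transmitted fraction does not collapse as γ decays. The function name gamma_fedht_step, the base threshold lam, and the γ-scaled thresholding rule are illustrative assumptions for exposition, not the paper's exact algorithm.

```python
import numpy as np

def gamma_fedht_step(grad, residual, gamma, lam):
    """Illustrative sketch: stepsize-aware hard thresholding with Error-Feedback.

    grad     -- local (stochastic) gradient for this round
    residual -- error accumulated from entries not transmitted previously
    gamma    -- current stepsize (assumed to decay over training)
    lam      -- base threshold (hypothetical hyperparameter for this sketch)
    """
    # Error-Feedback: fold the untransmitted residual back into the update.
    corrected = gamma * grad + residual

    # Stepsize-aware hard threshold: the threshold shrinks with gamma, so the
    # compression ratio stays roughly stable as updates become smaller.
    # Elementwise comparison costs O(d), versus O(d log2 k) for Top-k selection.
    threshold = gamma * lam
    mask = np.abs(corrected) > threshold
    compressed = np.where(mask, corrected, 0.0)

    # Carry the untransmitted part forward as the new residual.
    new_residual = corrected - compressed
    return compressed, new_residual

# Example usage (synthetic data):
rng = np.random.default_rng(0)
grad = rng.normal(size=1000)
residual = np.zeros(1000)
update, residual = gamma_fedht_step(grad, residual, gamma=0.1, lam=0.05)
```

In this sketch, only the entries of `update` that survive the threshold would be communicated to the server, while `residual` is kept locally and reinjected in the next round, which is the standard Error-Feedback pattern the paper builds on.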
