γ-FedHT: Stepsize-Aware Hard-Threshold Gradient Compression in Federated Learning

IEEE Conference on Computer Communications (IEEE INFOCOM), 2025
Main: 9 pages
5 figures
Bibliography: 1 page
4 tables
Abstract

Gradient compression can effectively alleviate communication bottlenecks in Federated Learning (FL). Contemporary state-of-the-art sparse compressors, such as Top-k, exhibit high computational complexity, up to O(d log₂ k), where d is the number of model parameters. The hard-threshold compressor, which simply transmits elements with absolute values above a fixed threshold, reduces this complexity to O(d). However, hard-threshold compression causes accuracy degradation in FL, where the datasets are non-IID and the stepsize γ decays to ensure model convergence. The decaying stepsize shrinks the updates and causes the compression ratio of hard-threshold compression to drop rapidly to an aggressive level, at or below which the model accuracy degrades severely. To address this, we propose γ-FedHT, a stepsize-aware low-cost compressor with Error-Feedback to guarantee convergence. Since the traditional theoretical framework of FL does not account for Error-Feedback, we introduce the fundamental conversion of Error-Feedback into this framework. We prove that γ-FedHT has a convergence rate of O(1/T) (where T is the total number of training iterations) in μ-strongly convex cases and O(1/√T) in non-convex cases, the same as FedAVG. Extensive experiments demonstrate that γ-FedHT improves accuracy by up to 7.42% over Top-k under equal communication traffic on various non-IID image datasets.
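To make the mechanism concrete, below is a minimal Python/NumPy sketch of a hard-threshold compressor with Error-Feedback whose threshold is scaled with the stepsize γ, so that the transmitted fraction does not collapse as γ decays. The function name gamma_fedht_step, the base threshold lam, and the γ-scaled thresholding rule are illustrative assumptions for exposition, not the paper's exact algorithm.

```python
import numpy as np

def gamma_fedht_step(grad, residual, gamma, lam):
    """Illustrative sketch: stepsize-aware hard thresholding with Error-Feedback.

    grad     -- local (stochastic) gradient for this round
    residual -- error accumulated from entries not transmitted previously
    gamma    -- current stepsize (assumed to decay over training)
    lam      -- base threshold (hypothetical hyperparameter for this sketch)
    """
    # Error-Feedback: fold the untransmitted residual back into the update.
    corrected = gamma * grad + residual

    # Stepsize-aware hard threshold: the threshold shrinks with gamma, so the
    # compression ratio stays roughly stable as updates become smaller.
    # Elementwise comparison costs O(d), versus O(d log2 k) for Top-k selection.
    threshold = gamma * lam
    mask = np.abs(corrected) > threshold
    compressed = np.where(mask, corrected, 0.0)

    # Carry the untransmitted part forward as the new residual.
    new_residual = corrected - compressed
    return compressed, new_residual

# Example usage (synthetic data):
rng = np.random.default_rng(0)
grad = rng.normal(size=1000)
residual = np.zeros(1000)
update, residual = gamma_fedht_step(grad, residual, gamma=0.1, lam=0.05)
```

In this sketch, only the entries of `update` that survive the threshold would be communicated to the server, while `residual` is kept locally and reinjected in the next round, which is the standard Error-Feedback pattern the paper builds on.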
