γ-FedHT: Stepsize-Aware Hard-Threshold Gradient Compression in Federated Learning
Gradient compression can effectively alleviate communication bottlenecks in Federated Learning (FL). Contemporary state-of-the-art sparse compressors, such as Top-k, exhibit high computational complexity, up to O(d log d), where d is the number of model parameters. The hard-threshold compressor, which simply transmits elements whose absolute values exceed a fixed threshold, was therefore proposed to reduce the complexity to O(d). However, hard-threshold compression causes accuracy degradation in FL, where the datasets are non-IID and the stepsize decays for model convergence. The decaying stepsize shrinks the updates, so the compression ratio of hard-threshold compression drops rapidly to an aggressive ratio; at or below this ratio, model accuracy has been observed to degrade severely. To address this, we propose γ-FedHT, a stepsize-aware low-cost compressor with Error-Feedback to guarantee convergence. Since the traditional theoretical framework of FL does not consider Error-Feedback, we introduce the fundamental conservation property of Error-Feedback. We prove that γ-FedHT has a convergence rate of O(1/T) (with T the total number of training iterations) under μ-strongly convex cases and O(1/√T) under non-convex cases, the same as FedAVG. Extensive experiments demonstrate that γ-FedHT improves accuracy over Top-k under equal communication traffic on various non-IID image datasets.
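To make the two compressors and the role of Error-Feedback concrete, here is a minimal pure-Python sketch. The function names and the residual-accumulation scheme are illustrative assumptions, not the paper's exact implementation: hard-threshold keeps entries above a fixed magnitude in a single O(d) pass, Top-k requires a (partial) sort, and Error-Feedback re-adds whatever compression discarded before the next round.

```python
def hard_threshold_compress(grad, threshold):
    """Keep entries with |g_i| >= threshold (one O(d) pass); zero out the rest."""
    return [g if abs(g) >= threshold else 0.0 for g in grad]

def top_k_compress(grad, k):
    """Keep the k largest-magnitude entries; selection costs more than one pass."""
    keep = set(sorted(range(len(grad)), key=lambda i: -abs(grad[i]))[:k])
    return [g if i in keep else 0.0 for i, g in enumerate(grad)]

class ErrorFeedback:
    """Illustrative Error-Feedback loop: accumulate the compression residual
    and fold it back into the gradient before the next compression round."""
    def __init__(self, dim):
        self.residual = [0.0] * dim
    def step(self, grad, threshold):
        corrected = [g + r for g, r in zip(grad, self.residual)]
        sent = hard_threshold_compress(corrected, threshold)
        self.residual = [c - s for c, s in zip(corrected, sent)]
        return sent
```

Note how Error-Feedback counteracts the decaying-stepsize problem the abstract describes: an entry too small to clear the fixed threshold in one round accumulates in the residual until it is eventually transmitted, rather than being lost.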