v2 (latest)

Reviving Stale Updates: Data-Free Knowledge Distillation for Asynchronous Federated Learning

Main: 8 pages
Appendix: 10 pages
Bibliography: 3 pages
12 figures
8 tables
Abstract

Federated learning (FL) enables collaborative model training across distributed clients without sharing raw data, yet its scalability is limited by synchronization overhead. Asynchronous federated learning (AFL) alleviates this issue by allowing clients to communicate independently, thereby improving wall-clock efficiency in large-scale, hardware-heterogeneous environments. However, asynchrony introduces updates computed on outdated global models (staleness), which can destabilize optimization and hinder convergence. We propose FedRevive, an AFL framework that revives stale updates through data-free knowledge distillation (DFKD). FedRevive integrates parameter-space aggregation with a lightweight, server-side DFKD process that transfers knowledge from stale client updates to the current global model without access to client data. A meta-learned generator synthesizes pseudo-samples used for multi-teacher distillation. A hybrid aggregation scheme that combines raw updates with DFKD updates effectively mitigates staleness while retaining the scalability of AFL. Experiments on various vision and text benchmarks show that FedRevive trains up to 38.4% faster and reaches up to 16.5% higher final accuracy than asynchronous baselines.
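The hybrid aggregation idea in the abstract can be sketched as follows. This is a minimal illustration, not the paper's actual algorithm: the staleness-decay function, the blending coefficient `beta`, and the function names are all assumptions, and the DFKD-distilled parameters are taken as a given input rather than produced by an actual generator.

```python
import numpy as np

def staleness_weight(tau: int, a: float = 0.5) -> float:
    """Hypothetical polynomial decay: an update with staleness tau
    (number of global rounds since the client pulled the model)
    contributes with weight (1 + tau)^(-a)."""
    return (1.0 + tau) ** (-a)

def hybrid_aggregate(global_params: np.ndarray,
                     client_params: np.ndarray,
                     distilled_params: np.ndarray,
                     tau: int,
                     beta: float = 0.5) -> np.ndarray:
    """Blend the raw (possibly stale) client update with a server-side
    DFKD-'revived' update. The raw parameter-space delta is down-weighted
    as staleness grows, so old updates lean more on the distilled delta."""
    w = staleness_weight(tau) * beta
    raw_delta = client_params - global_params        # raw client contribution
    revived_delta = distilled_params - global_params # contribution after distillation
    return global_params + w * raw_delta + (1.0 - w) * revived_delta
```

With `tau = 0` and `beta = 1.0` the raw update is applied unchanged; as `tau` grows, the aggregate shifts toward the distilled parameters, which is the staleness-mitigation behavior the abstract describes.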
