Federated learning has gained popularity for distributed learning without aggregating sensitive data from clients. But meanwhile, the distributed and isolated nature of data isolation may be complicated by data quality, making it more vulnerable to noisy labels. Many efforts exist to defend against the negative impacts of noisy labels in centralized or federated settings. However, there is a lack of a benchmark that comprehensively considers the impact of noisy labels in a wide variety of typical FL settings. In this work, we serve the first standardized benchmark that can help researchers fully explore potential federated noisy settings. Also, we conduct comprehensive experiments to explore the characteristics of these data settings and the comparison across baselines, which may guide method development in the future. We highlight the 20 basic settings for 6 datasets proposed in our benchmark and standardized simulation pipeline for federated noisy label learning, including implementations of 9 baselines. We hope this benchmark can facilitate idea verification in federated learning with noisy labels. \texttt{FedNoisy} is available at \codeword{this https URL}.
View on arXiv@article{liang2025_2306.11650, title={ FedNoisy: Federated Noisy Label Learning Benchmark }, author={ Siqi Liang and Jintao Huang and Junyuan Hong and Dun Zeng and Jiayu Zhou and Zenglin Xu }, journal={arXiv preprint arXiv:2306.11650}, year={ 2025 } }