Partial Knowledge Distillation for Alleviating the Inherent Inter-Class Discrepancy in Federated Learning

Substantial efforts have been devoted to alleviating the impact of long-tailed class distributions in federated learning. In this work, we observe an interesting phenomenon: certain weak classes consistently exist even under class-balanced learning. These weak classes, unlike the minority classes considered in previous works, are inherent to the data and remain fairly consistent across various network structures, learning paradigms, and data partitioning methods. The inherent inter-class accuracy discrepancy can exceed 36.9% for federated learning on the FashionMNIST and CIFAR-10 datasets, even when the class distribution is balanced both globally and locally. We empirically analyze the potential cause of this phenomenon. Furthermore, a partial knowledge distillation (PKD) method is proposed to improve the model's classification accuracy on weak classes, in which knowledge transfer is triggered by specific misclassifications within certain weak classes. Experimental results show that the accuracy of weak classes can be improved by 10.7%, effectively reducing the inherent inter-class discrepancy.
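
The abstract describes distillation that is triggered only by misclassifications within weak classes. The snippet below is a minimal sketch of such a selectively triggered loss, not the authors' implementation: the set of weak classes, the exact trigger rule, the teacher source, and the weighting (`temperature`, `alpha`) are all assumptions introduced here for illustration.

```python
import torch
import torch.nn.functional as F


def partial_kd_loss(student_logits, teacher_logits, targets,
                    weak_classes, temperature=2.0, alpha=0.5):
    """Cross-entropy plus a distillation term applied only to samples
    that belong to a weak class AND are misclassified by the student.
    Illustrative sketch; hyperparameters and trigger rule are assumptions.
    """
    ce = F.cross_entropy(student_logits, targets)

    # Trigger: sample is from a weak class and currently misclassified.
    preds = student_logits.argmax(dim=1)
    triggered = torch.isin(targets, weak_classes) & (preds != targets)

    if triggered.any():
        # Standard temperature-scaled KL distillation on triggered samples only.
        s = F.log_softmax(student_logits[triggered] / temperature, dim=1)
        t = F.softmax(teacher_logits[triggered] / temperature, dim=1)
        kd = F.kl_div(s, t, reduction="batchmean") * temperature ** 2
    else:
        kd = torch.zeros((), device=student_logits.device)

    return (1 - alpha) * ce + alpha * kd


# Usage with hypothetical shapes and a hypothetical weak-class set.
if __name__ == "__main__":
    weak_classes = torch.tensor([2, 4, 6])      # assumed identified weak classes
    student_logits = torch.randn(32, 10)
    teacher_logits = torch.randn(32, 10)
    targets = torch.randint(0, 10, (32,))
    loss = partial_kd_loss(student_logits, teacher_logits, targets, weak_classes)
    print(loss.item())
```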
@article{gan2025_2411.15403,
  title   = {Partial Knowledge Distillation for Alleviating the Inherent Inter-Class Discrepancy in Federated Learning},
  author  = {Xiaoyu Gan and Jingbo Jiang and Jingyang Zhu and Xiaomeng Wang and Xizi Chen and Chi-Ying Tsui},
  journal = {arXiv preprint arXiv:2411.15403},
  year    = {2025}
}