Personalized Federated Training of Diffusion Models with Privacy Guarantees

The scarcity of accessible, compliant, and ethically sourced data presents a considerable challenge to the adoption of artificial intelligence (AI) in sensitive fields like healthcare, finance, and biomedical research. Furthermore, access to unrestricted public datasets is increasingly constrained by rising concerns over privacy, copyright, and competition. Synthetic data has emerged as a promising alternative, and diffusion models -- a cutting-edge generative AI technology -- provide an effective means of generating high-quality and diverse synthetic data. In this paper, we introduce a novel federated learning framework for training diffusion models on decentralized private datasets. Our framework leverages personalization and the inherent noise of the forward diffusion process to produce high-quality samples while ensuring robust differential privacy guarantees. Experiments show that our framework outperforms non-collaborative training methods, particularly in settings with high data heterogeneity, and effectively reduces biases and imbalances in synthetic data, resulting in fairer downstream models.
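The "inherent noise in the forward diffusion process" refers to the Gaussian noise that a DDPM-style forward process injects into data at each timestep. The snippet below is a minimal sketch of that forward noising step, not the paper's actual implementation; the linear beta schedule and the function name `forward_diffuse` are illustrative assumptions.

```python
import numpy as np

def forward_diffuse(x0, t, betas, rng=None):
    """Sample x_t ~ q(x_t | x_0) for a DDPM-style forward process.

    Uses the closed form x_t = sqrt(alpha_bar_t) * x0
    + sqrt(1 - alpha_bar_t) * eps, with eps ~ N(0, I).
    (Illustrative sketch; not the paper's implementation.)
    """
    rng = np.random.default_rng() if rng is None else rng
    alphas = 1.0 - betas
    alpha_bar = np.cumprod(alphas)[t]       # cumulative signal retention up to step t
    eps = rng.standard_normal(x0.shape)     # the Gaussian noise injected at step t
    xt = np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps
    return xt, eps

# Toy usage with a linear beta schedule over T = 1000 steps (a common choice).
T = 1000
betas = np.linspace(1e-4, 0.02, T)
x0 = np.ones(8)                             # stand-in for a data point
xt, eps = forward_diffuse(x0, t=500, betas=betas, rng=np.random.default_rng(0))
```

As `t` grows, `alpha_bar` shrinks toward zero and `x_t` approaches pure Gaussian noise; it is this built-in noise injection that the framework combines with personalization to obtain differential privacy guarantees.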
@article{patel2025_2504.00952,
  title={Personalized Federated Training of Diffusion Models with Privacy Guarantees},
  author={Kumar Kshitij Patel and Weitong Zhang and Lingxiao Wang},
  journal={arXiv preprint arXiv:2504.00952},
  year={2025}
}