SGDE: Secure Generative Data Exchange for Cross-Silo Federated Learning

International Conference on Artificial Intelligence and Pattern Recognition (AIPR), 2021
Abstract

Privacy regulations such as the GDPR impose transparency and security as design pillars for data processing algorithms. In this context, federated learning is one of the most influential frameworks for privacy-preserving distributed machine learning, achieving astounding results in many natural language processing and computer vision tasks. Several federated learning frameworks employ differential privacy to prevent private data leakage to unauthorized parties and malicious attackers. Many studies, however, highlight the vulnerabilities of standard federated learning to poisoning and inference attacks, thus raising concerns about potential risks for sensitive data. To address this issue, we present SGDE, a generative data exchange protocol that improves user security and machine learning performance in a cross-silo federation. The core of SGDE is to share data generators with strong differential privacy guarantees trained on private data, instead of communicating explicit gradient information. These generators synthesize an arbitrarily large amount of data that retain the distinctive features of private samples but differ substantially from them. We show how the inclusion of SGDE in a cross-silo federated network improves resilience to the most influential attacks on federated learning. We test our approach on image and tabular datasets, exploiting beta-variational autoencoders as data generators, and highlight fairness and performance improvements over local and federated learning on non-generated data.
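The exchange flow the abstract describes can be sketched end to end: each silo fits a generative model to its private data, releases only (noised) generator parameters, and every participant trains on the pooled synthetic samples. The sketch below is purely illustrative and makes strong simplifying assumptions: a per-class Gaussian stands in for the paper's beta-variational autoencoder, and plain parameter noise stands in for the paper's calibrated differential-privacy mechanism; all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

class GaussianGenerator:
    """Toy stand-in for a silo's generative model (the paper uses beta-VAEs)."""
    def __init__(self, means, stds):
        self.means, self.stds = means, stds

    def sample(self, n_per_class):
        xs, ys = [], []
        for label, (mu, sd) in enumerate(zip(self.means, self.stds)):
            xs.append(rng.normal(mu, sd, size=(n_per_class, mu.shape[0])))
            ys.append(np.full(n_per_class, label))
        return np.vstack(xs), np.concatenate(ys)

def train_generator(X, y, noise_scale=0.05):
    """Fit per-class statistics and perturb the released parameters.

    The added noise is only a crude illustration of the idea of releasing
    privatized generator parameters; it is NOT a differential-privacy
    mechanism with formal guarantees.
    """
    means, stds = [], []
    for label in np.unique(y):
        Xc = X[y == label]
        means.append(Xc.mean(0) + rng.normal(0, noise_scale, Xc.shape[1]))
        stds.append(Xc.std(0) + 1e-3)
    return GaussianGenerator(means, stds)

def make_silo(n):
    """Two silos holding disjoint private samples from the same 2-class task."""
    X0 = rng.normal(-1.0, 0.5, size=(n, 2))
    X1 = rng.normal(+1.0, 0.5, size=(n, 2))
    return np.vstack([X0, X1]), np.concatenate([np.zeros(n), np.ones(n)])

silos = [make_silo(200), make_silo(200)]

# Exchange generators, never raw data or gradients, then pool synthetic data.
generators = [train_generator(X, y) for X, y in silos]
synth = [g.sample(100) for g in generators]
X_syn = np.vstack([x for x, _ in synth])
y_syn = np.concatenate([y for _, y in synth])

# Any downstream model can consume the synthetic pool; here, a
# nearest-class-mean classifier keeps the example dependency-free.
centroids = np.array([X_syn[y_syn == c].mean(0) for c in (0, 1)])

def predict(X):
    return np.argmin(((X[:, None, :] - centroids) ** 2).sum(-1), axis=1)

X_test, y_test = make_silo(100)
acc = (predict(X_test) == y_test).mean()
print(f"accuracy of classifier trained on pooled synthetic data: {acc:.2f}")
```

The point of the sketch is the communication pattern: only generator parameters cross silo boundaries, and each party can draw an arbitrarily large synthetic training set locally.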
