
Training deep neural networks via federated learning allows clients to share model updates computed on their local data instead of the data itself. In practice, however, it has been shown that a client's private information, unrelated to the main learning task, can be recovered from the shared model, undermining the promised privacy protection. Yet there is still no formal approach for quantifying the leakage of such latent information from the shared model or gradients. As a solution, we introduce and evaluate two mathematically grounded metrics that better characterize the amount of information contained in the gradients shared by clients and computed on their private data. First, using an adaptation of the empirical $\mathcal{V}$-information, we show how to quantify the amount of private latent information captured in the gradients that is usable by an attacker. Second, based on a sensitivity analysis using Jacobian matrices, we show how to measure changes in the gradients with respect to latent information. Further, we demonstrate the applicability of our proposed metrics in (i) localizing private latent information in a layer-wise manner, both with and without knowledge of the attacker's capabilities, and (ii) comparing the capacity of each layer of a neural network to capture higher-level versus lower-level latent information. Experimental results on three real-world datasets using three benchmark models demonstrate the validity of the proposed metrics.
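As a rough illustration of the two ideas, and not the authors' implementation, the sketch below uses a hypothetical toy model and synthetic data. It (i) estimates an empirical usable-information score by training a simple probe, standing in for the attacker's predictor class, to recover a latent attribute from per-example gradients, and (ii) measures gradient sensitivity via the Jacobian of one layer's gradient with respect to the input features, one of which carries the latent attribute. All names (TinyNet, the probe architecture, the synthetic leak into feature 0) are assumptions made for illustration.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical client model; the paper's experiments use benchmark architectures.
class TinyNet(nn.Module):
    def __init__(self, d_in=8, d_hidden=16, d_out=2):
        super().__init__()
        self.fc1 = nn.Linear(d_in, d_hidden)
        self.fc2 = nn.Linear(d_hidden, d_out)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))

model = TinyNet()

# Synthetic "private" data: a binary latent attribute leaks into input feature 0,
# while the main-task label is unrelated to it.
n, d_in = 256, 8
latent = torch.randint(0, 2, (n,))
x = torch.randn(n, d_in)
x[:, 0] += 2.0 * latent
y = torch.randint(0, 2, (n,))

# Per-example gradients of one layer's weights: a stand-in for what a client shares.
def per_example_grads(layer):
    feats = []
    for i in range(n):
        loss = F.cross_entropy(model(x[i:i + 1]), y[i:i + 1])
        g = torch.autograd.grad(loss, layer.weight)[0]
        feats.append(g.flatten())
    return torch.stack(feats)

grad_feats = per_example_grads(model.fc1)

# (i) Empirical usable-information estimate: train a probe from the attacker's
# predictor class (here, logistic regression) to recover the latent attribute
# from gradients, then compare its cross-entropy with the marginal entropy:
#   I(grad -> latent) ~= H(latent) - H_probe(latent | grad)   (in nats)
probe = nn.Linear(grad_feats.shape[1], 2)
opt = torch.optim.Adam(probe.parameters(), lr=1e-2)
for _ in range(300):
    opt.zero_grad()
    F.cross_entropy(probe(grad_feats), latent).backward()
    opt.step()

p1 = latent.float().mean().item()
h_marginal = -(p1 * math.log(p1) + (1 - p1) * math.log(1 - p1))
h_conditional = F.cross_entropy(probe(grad_feats), latent).item()
print(f"empirical usable information ~ {h_marginal - h_conditional:.3f} nats")

# (ii) Jacobian-based sensitivity: how strongly does the shared gradient of fc1
# react to each input feature (feature 0 carries the latent attribute)?
def fc1_grad(xi):
    loss = F.cross_entropy(model(xi.unsqueeze(0)), y[:1])
    return torch.autograd.grad(loss, model.fc1.weight, create_graph=True)[0].flatten()

J = torch.autograd.functional.jacobian(fc1_grad, x[0])  # shape: (|grad of fc1|, d_in)
print("per-feature sensitivity of fc1's gradient:", J.norm(dim=0))
```

In a layer-wise analysis like the one described in the abstract, both quantities would be computed for every layer and compared across layers; only fc1 is shown here for brevity.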