On the interplay of Explainability, Privacy and Predictive Performance with Explanation-assisted Model Extraction

Abstract

Machine Learning as a Service (MLaaS) has gained significant traction as a means of deploying powerful predictive models, offering an ease of use that lets organizations leverage advanced analytics without substantial investments in specialized infrastructure or expertise. However, MLaaS platforms must be safeguarded against security and privacy attacks, such as model extraction attacks (MEA). The increasing integration of explainable AI (XAI) into MLaaS has introduced an additional privacy challenge, as attackers can exploit model explanations, particularly counterfactual explanations (CFs), to facilitate MEA. In this paper, we investigate the trade-offs among model performance, privacy, and explainability when employing Differential Privacy (DP), a promising technique for mitigating CF-facilitated MEA. We evaluate two distinct DP strategies: one applied during training of the classification model, and one applied at the explainer during CF generation.
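The second strategy, applying DP at the explainer, can be illustrated with a minimal sketch: before a counterfactual is released to the client, noise calibrated to the query's sensitivity is added to it. The sketch below uses the standard Gaussian mechanism; the function name, the assumption of a known L2 sensitivity, and the fixed noise calibration are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def dp_perturb_counterfactual(cf, sensitivity, epsilon, delta, rng=None):
    """Release a counterfactual explanation under (epsilon, delta)-DP
    via the Gaussian mechanism (illustrative sketch, not the paper's method).

    cf          : 1-D array, the counterfactual feature vector to release
    sensitivity : assumed L2 sensitivity of the CF-generation query
    epsilon, delta : differential-privacy parameters (delta in (0, 1))
    """
    rng = np.random.default_rng() if rng is None else rng
    # Standard Gaussian-mechanism noise scale for the given (epsilon, delta)
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    # Add i.i.d. Gaussian noise to every feature of the counterfactual
    return cf + rng.normal(0.0, sigma, size=cf.shape)
```

A smaller epsilon yields a larger sigma, so the released counterfactual is noisier and less useful to an extraction adversary, which is exactly the performance/privacy/explainability tension the paper studies.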

@article{ezzeddine2025_2505.08847,
  title={On the interplay of Explainability, Privacy and Predictive Performance with Explanation-assisted Model Extraction},
  author={Fatima Ezzeddine and Rinad Akel and Ihab Sbeity and Silvia Giordano and Marc Langheinrich and Omran Ayoub},
  journal={arXiv preprint arXiv:2505.08847},
  year={2025}
}