PPO-MI: Efficient Black-Box Model Inversion via Proximal Policy Optimization

Abstract

Model inversion attacks pose a significant privacy risk by attempting to reconstruct private training data from trained models. Most existing methods either depend on gradient estimation or require white-box access to model parameters, which limits their applicability in practical scenarios. In this paper, we propose PPO-MI, a novel reinforcement learning-based framework for black-box model inversion attacks. Our approach formulates the inversion task as a Markov Decision Process, in which an agent navigates the latent space of a generative model to reconstruct private training samples using only model predictions. By employing Proximal Policy Optimization (PPO) with a momentum-based state transition mechanism, along with a reward function that balances prediction accuracy and exploration, PPO-MI achieves efficient latent space exploration and high query efficiency. Extensive experiments show that PPO-MI outperforms existing methods while requiring less attack knowledge, and that it remains robust across various model architectures and datasets. These results underline its effectiveness and generalizability in practical black-box scenarios, raising important considerations for the privacy vulnerabilities of deployed machine learning models.
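The abstract names the key moving parts of the attack: a latent-code state, a momentum-based state transition, and a reward that trades off target-class confidence against exploration, optimized with PPO. The sketch below illustrates one such rollout under assumed details; the latent dimensionality, the momentum coefficient BETA, the exploration weight LAMBDA_EXPLORE, the Gaussian placeholder policy, and the toy target_confidence function are all illustrative stand-ins rather than the paper's actual implementation.

    # Minimal sketch of the black-box inversion loop described in the abstract.
    # Hyperparameters, the placeholder policy, and the toy confidence function
    # are assumptions; the paper's actual networks are not given here.
    import numpy as np

    LATENT_DIM = 128      # assumed GAN latent dimensionality
    BETA = 0.9            # assumed momentum coefficient for state transitions
    LAMBDA_EXPLORE = 0.1  # assumed weight on the exploration term

    def target_confidence(z: np.ndarray) -> float:
        """Stand-in for the black-box pipeline: generator(z) -> target model
        -> probability of the attacked class. Replaced by a toy function so
        the sketch runs without a real generator or classifier."""
        optimum = np.ones(LATENT_DIM) / np.sqrt(LATENT_DIM)
        return float(np.exp(-np.sum((z - optimum) ** 2)))

    def reward(z_next: np.ndarray, z_prev: np.ndarray) -> float:
        """Reward balancing prediction accuracy (target-class confidence)
        against exploration (distance moved in latent space)."""
        return target_confidence(z_next) + LAMBDA_EXPLORE * float(np.linalg.norm(z_next - z_prev))

    def ppo_clipped_objective(ratio: np.ndarray, advantage: np.ndarray, eps: float = 0.2) -> float:
        """Standard PPO clipped surrogate objective, averaged over a batch."""
        return float(np.mean(np.minimum(ratio * advantage,
                                        np.clip(ratio, 1 - eps, 1 + eps) * advantage)))

    def rollout(policy_std: float = 0.05, steps: int = 200, seed: int = 0):
        """One episode of the MDP: the state is the latent code, the action is
        a perturbation direction, and transitions accumulate momentum."""
        rng = np.random.default_rng(seed)
        z = rng.standard_normal(LATENT_DIM)   # initial state sampled from the prior
        velocity = np.zeros(LATENT_DIM)
        trajectory = []
        for _ in range(steps):
            action = policy_std * rng.standard_normal(LATENT_DIM)  # placeholder Gaussian policy
            velocity = BETA * velocity + (1.0 - BETA) * action      # momentum-based transition
            z_next = z + velocity
            trajectory.append((z.copy(), action, reward(z_next, z), z_next.copy()))
            z = z_next
        return trajectory

    if __name__ == "__main__":
        traj = rollout()
        print("final-step reward:", traj[-1][2])

In the full method, the collected (state, action, reward) transitions would feed a PPO update of a learned policy (the clipped surrogate is shown as ppo_clipped_objective), and target_confidence would wrap real queries to the generator and the deployed target classifier.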

@article{shou2025_2502.14370,
  title={PPO-MI: Efficient Black-Box Model Inversion via Proximal Policy Optimization},
  author={Xinpeng Shou},
  journal={arXiv preprint arXiv:2502.14370},
  year={2025}
}