
Targeted Unlearning Using Perturbed Sign Gradient Methods With Applications On Medical Images

Main: 18 pages · 13 figures · 11 tables · Bibliography: 4 pages · Appendix: 17 pages
Abstract

Machine unlearning aims to remove the influence of specific training samples from a trained model without full retraining. While prior work has largely focused on privacy-motivated settings, we recast unlearning as a general-purpose tool for post-deployment model revision. Specifically, we focus on utilizing unlearning in clinical contexts where data shifts, device deprecation, and policy changes are common. To this end, we propose a bilevel optimization formulation of boundary-based unlearning that can be solved using iterative algorithms. We provide convergence guarantees when first-order algorithms are used to unlearn. Our method introduces tunable loss design for controlling the forgetting-retention tradeoff and supports novel model composition strategies that merge the strengths of distinct unlearning runs. Across benchmark and real-world clinical imaging datasets, our approach outperforms baselines on both forgetting and retention metrics, including scenarios involving imaging devices and anatomical outliers. This work establishes machine unlearning as a modular, practical alternative to retraining for real-world model maintenance in clinical applications.
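The abstract describes unlearning via perturbed sign gradient updates that balance forgetting against retention. The paper's exact algorithm is not given here, so the sketch below is only an illustrative guess at the general idea: ascend on a forget-set loss gradient, descend on a retain-set loss gradient, and take a sign step on a noise-perturbed combination. The function name, the tradeoff weight `lam`, and the noise scale `sigma` are all hypothetical.

```python
import numpy as np

def perturbed_sign_step(w, grad_forget, grad_retain,
                        lr=0.01, sigma=0.001, lam=1.0, seed=0):
    """One hypothetical unlearning update (not the paper's exact method).

    Combines gradient ascent on the forget loss with gradient descent on
    the retain loss, perturbs the combined direction with Gaussian noise,
    and steps along its elementwise sign.
    """
    rng = np.random.default_rng(seed)
    # Negative forget gradient = ascend (forget); positive retain gradient
    # = descend (retain). `lam` weights the forgetting-retention tradeoff.
    g = -grad_forget + lam * grad_retain
    g_perturbed = g + sigma * rng.standard_normal(g.shape)
    return w - lr * np.sign(g_perturbed)
```

A sign step keeps the update magnitude uniform across parameters, which is one common way to make such iterates stable regardless of gradient scale.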

@article{nahass2025_2505.21872,
  title={Targeted Unlearning Using Perturbed Sign Gradient Methods With Applications On Medical Images},
  author={George R. Nahass and Zhu Wang and Homa Rashidisabet and Won Hwa Kim and Sasha Hubschman and Jeffrey C. Peterson and Ghasem Yazdanpanah and Chad A. Purnell and Pete Setabutr and Ann Q. Tran and Darvin Yi and Sathya N. Ravi},
  journal={arXiv preprint arXiv:2505.21872},
  year={2025}
}