Explainable post-training bias mitigation with distribution-based fairness metrics

Abstract
We develop a novel optimization framework with distribution-based fairness constraints for efficiently producing demographically blind, explainable models across a wide range of fairness levels. This is accomplished through post-processing, avoiding the need for retraining. Our framework, which is based on stochastic gradient descent, can be applied to a wide range of model types, with a particular emphasis on post-processing gradient-boosted decision trees. Building on previous work, we also design a broad class of interpretable global bias metrics compatible with our method. We empirically test our methodology on a variety of datasets and compare it to other methods.
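The page gives only the abstract, so the details of the optimization are not shown here. As a rough illustration of the kind of pipeline the abstract describes (not the authors' actual algorithm), the sketch below post-processes a fixed model's scores by stochastic gradient descent, penalizing a distribution-based fairness term: the Wasserstein-1 distance between the transformed score distributions of two demographic groups. The learned transform depends only on the score, so the post-processed model remains demographically blind at inference time. All names (`fit_postprocessor`, `wasserstein1`, `lambda_fair`) and the small three-parameter transform family are illustrative assumptions, not taken from the paper.

```python
# Hypothetical sketch of SGD-based post-processing with a distribution-based
# fairness penalty. Assumes PyTorch and that both demographic groups are
# non-empty; not the authors' method.
import torch

def wasserstein1(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """Wasserstein-1 distance between two 1-D empirical distributions,
    approximated by comparing their quantile functions on a shared grid."""
    n = min(a.numel(), b.numel())
    grid = torch.linspace(0.0, 1.0, n)
    return (torch.quantile(a, grid) - torch.quantile(b, grid)).abs().mean()

def fit_postprocessor(scores, group, lambda_fair=1.0, steps=500, lr=1e-2):
    """Learn a group-blind score transform t = scale*s + shift + bend*tanh(s)
    trading fidelity to the original scores against group-level bias."""
    s = torch.as_tensor(scores, dtype=torch.float32)
    g = torch.as_tensor(group, dtype=torch.bool)
    # (shift, scale_logit, bend); scale_logit chosen so softplus(0.5413) ~= 1,
    # i.e. the transform starts near the identity.
    params = torch.tensor([0.0, 0.5413, 0.0], requires_grad=True)
    opt = torch.optim.SGD([params], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        shift = params[0]
        scale = torch.nn.functional.softplus(params[1]) + 1e-3  # keep scale > 0
        bend = params[2]
        t = scale * s + shift + bend * torch.tanh(s)  # depends on the score only
        fidelity = ((t - s) ** 2).mean()              # stay close to the base model
        fairness = wasserstein1(t[g], t[~g])          # distribution-based bias metric
        (fidelity + lambda_fair * fairness).backward()
        opt.step()
    return params.detach()
```

Sweeping `lambda_fair` from 0 upward traces out post-processed models at a range of fairness levels without retraining the base model, which is the trade-off the abstract's "wide range of fairness levels" refers to; the fairness penalty itself could be swapped for any differentiable distribution-based metric.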
@article{franks2025_2504.01223,
  title   = {Explainable post-training bias mitigation with distribution-based fairness metrics},
  author  = {Ryan Franks and Alexey Miroshnikov},
  journal = {arXiv preprint arXiv:2504.01223},
  year    = {2025}
}