
Monitoring of Static Fairness

Thomas A. Henzinger
Mahyar Karimi
Konstantin Kueffner
Kaushik Mallik
Main: 43 pages
6 figures
Bibliography: 7 pages
1 table
Abstract

Machine-learned systems are in widespread use for making decisions about humans, and it is important that they are fair, i.e., not biased against individuals based on sensitive attributes. We present a general framework of runtime verification of algorithmic fairness for systems whose models are unknown, but are assumed to have a Markov chain structure, with or without full observation of the state space. We introduce a specification language that can model many common algorithmic fairness properties, such as demographic parity, equal opportunity, and social burden. We build monitors that observe a long sequence of events as generated by a given system, and output, after each observation, a quantitative estimate of how fair or biased the system was on that run until that point in time. The estimate is proven to be correct modulo a variable error bound and a given confidence level, where the error bound gets tighter as the observed sequence gets longer. We present two categories of monitoring algorithms, namely ones with a uniform error bound across all time points, and ones with weaker non-uniform, pointwise error bounds at different time points. Our monitoring algorithms use statistical tools that are adapted to suit the dynamic requirements of monitoring and the special needs of the fairness specifications. Using a prototype implementation, we show how we can monitor if a bank is fair in giving loans to applicants from different social backgrounds, and if a college is fair in admitting students while maintaining a reasonable financial burden on society. In these experiments, our monitors took less than a millisecond to update their verdicts after each observation.
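To make the monitoring idea concrete, here is a minimal sketch (not the authors' algorithm) of a streaming monitor for demographic parity: after each observed decision it outputs a point estimate of the acceptance-rate gap between two groups, together with a Hoeffding-style error bound (combined across groups by a union bound) that tightens as more events are observed. The class name `FairnessMonitor`, the event format, and the choice of concentration inequality are assumptions made for this illustration.

```python
# Illustrative sketch of a quantitative runtime monitor for demographic parity.
import math
import random


class FairnessMonitor:
    def __init__(self, delta: float = 0.05):
        self.delta = delta            # target confidence level is 1 - delta
        self.counts = {0: 0, 1: 0}    # events observed per group
        self.accepts = {0: 0, 1: 0}   # favorable decisions per group

    def observe(self, group: int, accepted: bool) -> None:
        """Update the monitor with one observed decision event."""
        self.counts[group] += 1
        self.accepts[group] += int(accepted)

    def verdict(self):
        """Return (estimate, error_bound) for the acceptance-rate gap, or None."""
        if min(self.counts.values()) == 0:
            return None  # need at least one observation from each group
        rates = {g: self.accepts[g] / self.counts[g] for g in self.counts}
        estimate = rates[0] - rates[1]
        # Two-sided Hoeffding bound per group (each holding with prob. 1 - delta/2),
        # summed so the combined interval holds with probability at least 1 - delta.
        bound = sum(
            math.sqrt(math.log(4 / self.delta) / (2 * self.counts[g]))
            for g in self.counts
        )
        return estimate, bound


if __name__ == "__main__":
    random.seed(0)
    monitor = FairnessMonitor(delta=0.05)
    # Simulated biased decision maker: group 0 is accepted more often than group 1.
    for _ in range(10_000):
        group = random.randint(0, 1)
        accepted = random.random() < (0.7 if group == 0 else 0.5)
        monitor.observe(group, accepted)
    print(monitor.verdict())  # estimate near 0.2, with a bound that shrinks as n grows
```

Note that this pointwise interval is valid at each fixed time step; obtaining a bound that holds uniformly over the entire run, as in the paper's first category of monitors, requires stronger anytime-valid concentration arguments.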
