
Fair Active Learning

Abstract

Bias in training data, together with the use of proxy attributes, is arguably the main cause of unfair machine learning outcomes. ML models are trained on historical data that are problematic due to inherent societal bias. Moreover, collecting labeled data in societal applications is challenging and costly, so proxy attributes are often used as alternatives to true labels. Yet biased proxies carry that bias into the model. In this paper, we introduce fair active learning (FAL) as a resolution. Given a limited labeling budget, FAL carefully selects the data points to be labeled in order to balance model performance and fairness. Our comprehensive experiments on real datasets confirm a significant fairness improvement while maintaining model performance.
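The abstract does not specify FAL's acquisition function, so the following is only an illustrative sketch of fairness-aware sample selection under assumed design choices, not the paper's actual method. It mixes a standard uncertainty score with an estimated demographic-parity improvement via a trade-off weight `alpha`; the names `fal_select` and `demographic_parity_gap`, and the optimistic-label retraining heuristic, are all hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def entropy(p):
    """Binary prediction entropy: higher means more uncertain."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -(p * np.log(p) + (1 - p) * np.log(1 - p))

def demographic_parity_gap(y_pred, sensitive):
    """Absolute gap in positive-prediction rates between two groups."""
    return abs(y_pred[sensitive == 0].mean() - y_pred[sensitive == 1].mean())

def fal_select(model, X_pool, s_pool, X_labeled, y_labeled,
               batch_size=10, alpha=0.5):
    """Pick `batch_size` pool points scoring highest on a convex
    combination of uncertainty and expected fairness improvement
    (a hypothetical acquisition rule, not the paper's)."""
    proba = model.predict_proba(X_pool)[:, 1]
    unc = entropy(proba)

    base_gap = demographic_parity_gap(model.predict(X_pool), s_pool)
    fair_gain = np.empty(len(X_pool))
    for i in range(len(X_pool)):
        # Optimistically assume the point receives the label the model
        # expects, retrain, and measure the change in the parity gap.
        y_guess = int(proba[i] >= 0.5)
        m = LogisticRegression(max_iter=1000).fit(
            np.vstack([X_labeled, X_pool[i:i + 1]]),
            np.append(y_labeled, y_guess))
        fair_gain[i] = base_gap - demographic_parity_gap(
            m.predict(X_pool), s_pool)

    # Normalize both criteria to [0, 1] before mixing.
    norm = lambda v: (v - v.min()) / (v.ptp() + 1e-12)
    score = alpha * norm(unc) + (1 - alpha) * norm(fair_gain)
    return np.argsort(score)[-batch_size:]
```

Setting `alpha = 1` recovers plain uncertainty sampling, while `alpha = 0` selects purely for the estimated fairness gain; intermediate values trade the two off, mirroring the performance/fairness balance the abstract describes.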
