
Dissecting Racial Bias in a Credit Scoring System Experimentally Developed for the Brazilian Population

Abstract

We dissect an experimental credit scoring model developed with real data and demonstrate -- without having access to protected attributes -- how the use of location information introduces racial bias. We analyze the gradient-boosted tree model with the aid of a game-theoretic ML explainability technique, counterfactual experiments, and Brazilian census data. The present experiment testifies to the importance of developing methods and language that go beyond the need for access to protected attributes when auditing ML models, the necessity of considering regional specificities when reflecting on racial issues, and the importance of census data to the AI research community. To the best of our knowledge, this is the first documented case of how algorithmic racial bias may easily emerge in an ML credit scoring model built with data from Brazil, the country with the largest Black population outside Africa.
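The counterfactual probe described above can be illustrated with a minimal sketch. This is not the paper's actual model or data: all features, the synthetic labels, and the `neighborhood` proxy flag are hypothetical, constructed only to show how flipping a location feature while holding everything else fixed can reveal a score gap without any access to protected attributes.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
n = 2000
# Hypothetical applicant features: income, debt ratio, and a binary
# "neighborhood" flag standing in for coarse location information.
income = rng.normal(5.0, 2.0, n)
debt = rng.normal(0.4, 0.1, n)
neighborhood = rng.integers(0, 2, n)

# Synthetic label deliberately correlated with the location flag,
# mimicking a proxy effect in historical lending data.
logits = 0.8 * income - 3.0 * debt - 1.5 * neighborhood
y = (logits + rng.normal(0, 1, n) > 2.5).astype(int)
X = np.column_stack([income, debt, neighborhood])

model = GradientBoostingClassifier(random_state=0).fit(X, y)

# Counterfactual experiment: flip only the neighborhood flag and
# measure the average shift in the predicted approval probability.
X_cf = X.copy()
X_cf[:, 2] = 1 - X_cf[:, 2]
delta = model.predict_proba(X_cf)[:, 1] - model.predict_proba(X)[:, 1]
gap = np.abs(delta).mean()
print(f"mean |score shift| from flipping location: {gap:.3f}")
```

A nonzero gap means the model's scores depend on location even with all financial features held fixed; when location correlates with race, as the paper shows it does in the Brazilian context via census data, this dependence translates into racial bias.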
