FAIRE: Assessing Racial and Gender Bias in AI-Driven Resume Evaluations

Abstract

In an era where AI-driven hiring is transforming recruitment practices, concerns about fairness and bias have become increasingly important. To explore these issues, we introduce FAIRE (Fairness Assessment In Resume Evaluation), a benchmark for testing racial and gender bias in large language models (LLMs) used to evaluate resumes across different industries. We use two methods, direct scoring and ranking, to measure how model evaluations change when resumes are slightly altered to reflect different racial or gender identities. Our findings reveal that while every model exhibits some degree of bias, the magnitude and direction vary considerably. This benchmark provides a clear way to examine these differences and offers valuable insights into the fairness of AI-based hiring tools. It highlights the urgent need for strategies to reduce bias in AI-driven recruitment. Our benchmark code and dataset are open-sourced at our repository: this https URL.
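
A minimal sketch, in Python, of the perturbation-based evaluation the abstract describes: the same resume is copied with different identity-signaling names, each copy is scored, and the score gap is taken as a bias measure. Here score_resume, the identity name list, and the pairwise-gap metric are hypothetical stand-ins for illustration, not FAIRE's actual implementation.

# Sketch of direct-scoring bias measurement under identity perturbation.
# score_resume is a stub standing in for an LLM call; the names and gap
# metric are illustrative assumptions, not taken from the paper.
from itertools import combinations

def score_resume(resume_text: str) -> float:
    """Hypothetical stand-in for an LLM that returns a 0-100 fit score."""
    return 50.0 + 10.0 * ("Emily" in resume_text)  # deterministic stub

BASE_RESUME = "Name: {name}\nExperience: 5 years of software engineering."
IDENTITY_NAMES = {
    "variant_a": "Emily Walsh",
    "variant_b": "Darnell Washington",
}

def direct_scoring_gap(base: str, names: dict) -> float:
    """Score identity-perturbed copies of one resume; return the largest pairwise gap."""
    scores = {group: score_resume(base.format(name=n)) for group, n in names.items()}
    return max(abs(a - b) for a, b in combinations(scores.values(), 2))

print(direct_scoring_gap(BASE_RESUME, IDENTITY_NAMES))  # nonzero gap signals bias

A ranking-based variant would instead ask the model to order the perturbed resumes against a fixed candidate pool and compare the resulting ranks across identity groups.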

@article{wen2025_2504.01420,
  title={FAIRE: Assessing Racial and Gender Bias in AI-Driven Resume Evaluations},
  author={Athena Wen and Tanush Patil and Ansh Saxena and Yicheng Fu and Sean O'Brien and Kevin Zhu},
  journal={arXiv preprint arXiv:2504.01420},
  year={2025}
}