LLMs are Biased Teachers: Evaluating LLM Bias in Personalized Education

North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Main: 1 page · Bibliography: 1 page · Appendix: 47 pages · 55 figures · 4 tables
Abstract

With the increasing adoption of large language models (LLMs) in education, concerns about inherent biases in these models have gained prominence. We evaluate LLMs for bias in the personalized educational setting, focusing specifically on the models' role as "teachers". We reveal significant biases in how models generate and select educational content tailored to different demographic groups, including race, ethnicity, sex, gender, disability status, income, and national origin. We introduce and apply two bias score metrics, Mean Absolute Bias (MAB) and Maximum Difference Bias (MDB), to analyze 9 open- and closed-source state-of-the-art LLMs. Our experiments, which utilize over 17,000 educational explanations across multiple difficulty levels and topics, uncover that models perpetuate both typical and inverted harmful stereotypes.
