
FAIRGAMER: Evaluating Social Biases in LLM-Based Video Game NPCs

Yuanxiang Wang
Songlin Hu
Xiaodan Zhang
Zhongjiang Yao
Main: 8 pages · Bibliography: 4 pages · Appendix: 6 pages · 16 figures · 6 tables
Abstract

Large Language Models (LLMs) have increasingly enhanced or replaced traditional Non-Player Characters (NPCs) in video games. However, these LLM-based NPCs inherit underlying social biases (e.g., race or class), posing fairness risks during in-game interactions. To address the limited exploration of this issue, we introduce FairGamer, the first benchmark to evaluate social biases across three interaction patterns: transaction, cooperation, and competition. FairGamer assesses four bias types, including class, race, age, and nationality, across 12 distinct evaluation tasks using a novel metric, FairMCV. Our evaluation of seven frontier LLMs reveals that: (1) models exhibit biased decision-making, with Grok-4-Fast demonstrating the highest bias (average FairMCV = 76.9%); and (2) larger LLMs display more severe social biases, suggesting that increased model capacity inadvertently amplifies these biases. We release FairGamer at this https URL to facilitate future research on NPC fairness.
