Detecting and Mitigating Bias in LLMs through Knowledge Graph-Augmented Training

Abstract

Large language models have revolutionized natural language processing with their remarkable capability to understand and generate human-like text. However, many of these models inherit and further amplify the biases present in their training data, raising ethical and fairness concerns. Detecting and mitigating such biases is vital to ensuring that LLMs act responsibly and equitably across diverse domains. This work investigates Knowledge Graph-Augmented Training (KGAT) as a novel method to mitigate bias in LLMs. Using structured, domain-specific knowledge from real-world knowledge graphs, we improve the model's contextual understanding and reduce biased output. We evaluate on public bias-assessment datasets, including Gender Shades, Bias in Bios, and FairFace, using metrics such as demographic parity and equal opportunity for rigorous detection. We also apply targeted mitigation strategies to correct biased associations, yielding a significant drop in biased output and improved scores on these bias metrics. Built on real-world datasets and knowledge graphs, our framework is both scalable and effective, paving the way toward responsible deployment in sensitive, high-stakes applications.
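The two fairness metrics named above have standard definitions: demographic parity compares positive-prediction rates across demographic groups, while equal opportunity compares true-positive rates. The abstract does not give implementations, so the following is a minimal sketch of both gap metrics under the usual binary-group, binary-label formulation (all function names and the toy data are illustrative, not from the paper):

```python
def demographic_parity_gap(y_pred, group):
    """Absolute difference in positive-prediction rates between groups 0 and 1."""
    def rate(g):
        preds = [p for p, grp in zip(y_pred, group) if grp == g]
        return sum(preds) / len(preds)
    return abs(rate(0) - rate(1))

def equal_opportunity_gap(y_true, y_pred, group):
    """Absolute difference in true-positive rates (recall) between groups 0 and 1."""
    def tpr(g):
        # predictions restricted to truly positive examples of group g
        preds = [p for t, p, grp in zip(y_true, y_pred, group) if grp == g and t == 1]
        return sum(preds) / len(preds)
    return abs(tpr(0) - tpr(1))

# Toy example: binary predictions over two demographic groups
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 1, 0, 1, 1, 0, 0]
group  = [0, 0, 0, 0, 1, 1, 1, 1]

print(demographic_parity_gap(y_pred, group))         # 0.0 (both groups: rate 2/4)
print(equal_opportunity_gap(y_true, y_pred, group))  # 0.5 (TPRs 1/2 vs 2/2)
```

A gap of 0 on either metric indicates parity between the two groups; a mitigation method such as the paper's KGAT would aim to drive both gaps toward zero without degrading task accuracy.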

@article{kumar2025_2504.00310,
  title={Detecting and Mitigating Bias in LLMs through Knowledge Graph-Augmented Training},
  author={Rajeev Kumar and Harishankar Kumar and Kumari Shalini},
  journal={arXiv preprint arXiv:2504.00310},
  year={2025}
}