ValueCompass: A Framework for Measuring Contextual Value Alignment Between Human and LLMs

As AI systems become more advanced, ensuring their alignment with the values of diverse individuals and societies becomes increasingly critical. But how can we capture fundamental human values and assess the degree to which AI systems align with them? We introduce ValueCompass, a framework of fundamental values, grounded in psychological theory and a systematic review, to identify and evaluate human-AI alignment. We apply ValueCompass to measure the value alignment of humans and large language models (LLMs) across four real-world scenarios: collaborative writing, education, the public sector, and healthcare. Our findings reveal concerning misalignments between humans and LLMs; for example, humans frequently endorse values such as "National Security" that LLMs largely reject. We also observe that values differ across scenarios, highlighting the need for context-aware AI alignment strategies. This work provides valuable insights into the design space of human-AI alignment, laying the foundation for developing AI systems that responsibly reflect societal values and ethics.
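To make the measurement concrete, the sketch below shows one plausible way to score human-LLM value alignment in a single scenario: compare the fraction of humans endorsing each value against the fraction of LLM responses endorsing it. The value names, endorsement numbers, and the `alignment_score` function are illustrative assumptions for exposition, not the authors' actual instrument or data.

```python
# Hypothetical endorsement rates for a few values in one scenario
# (e.g., collaborative writing). Numbers are invented for illustration.
values = ["National Security", "Honesty", "Creativity"]

human_endorsement = {"National Security": 0.72, "Honesty": 0.95, "Creativity": 0.88}
llm_endorsement   = {"National Security": 0.10, "Honesty": 0.97, "Creativity": 0.91}

def alignment_score(human: dict, llm: dict, values: list) -> float:
    """Mean absolute agreement: 1.0 means identical endorsement rates."""
    gaps = [abs(human[v] - llm[v]) for v in values]
    return 1.0 - sum(gaps) / len(gaps)

score = alignment_score(human_endorsement, llm_endorsement, values)
print(f"Alignment score: {score:.2f}")  # lower scores flag misalignment
for v in values:
    gap = human_endorsement[v] - llm_endorsement[v]
    print(f"{v}: human-LLM endorsement gap = {gap:+.2f}")
```

Under this toy scoring, a large positive gap on "National Security" would surface exactly the kind of misalignment the paper reports, while small gaps on other values would indicate agreement.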
@article{shen2025_2409.09586,
  title   = {ValueCompass: A Framework for Measuring Contextual Value Alignment Between Human and LLMs},
  author  = {Hua Shen and Tiffany Knearem and Reshmi Ghosh and Yu-Ju Yang and Nicholas Clark and Tanushree Mitra and Yun Huang},
  journal = {arXiv preprint arXiv:2409.09586},
  year    = {2025}
}