The advanced role-playing capabilities of Large Language Models (LLMs) have paved the way for developing Role-Playing Agents (RPAs). However, existing benchmarks in this domain, such as HPD and SocialBench, face limitations including poor generalizability, implicit and inaccurate judgments, and the risk of model forgetting. To address these issues, we propose an automatic, scalable, and generalizable paradigm. Specifically, we construct a benchmark, SHARP, by extracting relations from a general knowledge graph and leveraging the inherent hallucination properties of RPAs to simulate interactions across roles. We employ ChatGPT for stance detection and define relationship hallucination, along with three related metrics, based on stance transfer. Extensive experiments validate the effectiveness and stability of our paradigm. We further explore the factors influencing these metrics and discuss the trade-off between blind loyalty to relationships and adherence to facts in RPAs.
