Papers citing 'You don't need a personality test to know these models are unreliable: Assessing the Reliability of Large Language Models on Psychometric Instruments'

Title
Social Perceptions of English Spelling Variation on Twitter: A Comparative Analysis of Human and LLM Responses Dong Nguyen Laura Rosseel 44 0 0 28 Nov 2025
Do Psychometric Tests Work for Large Language Models? Evaluation of Tests on Sexism, Racism, and Morality Jana Jung Marlene Lutz Indira Sen M. Strohmaier 84 0 0 13 Oct 2025
One Model, Many Morals: Uncovering Cross-Linguistic Misalignments in Computational Moral Reasoning Sualeha Farid Jayden Lin Zean Chen Shivani Kumar David Jurgens LRM 140 1 0 25 Sep 2025
Established Psychometric vs. Ecologically Valid Questionnaires: Rethinking Psychological Assessments in Large Language Models Dongmin Choi Woojung Song Jongwook Han Eun-Ju Lee Yohan Jo 88 0 0 12 Sep 2025
CAPE: Context-Aware Personality Evaluation Framework for Large Language Models Jivnesh Sandhan Fei Cheng Tushar Sandhan Yugo Murawaki LLMAG 1.2K 0 0 28 Aug 2025
The Prompt Makes the Person(a): A Systematic Evaluation of Sociodemographic Persona Prompting for Large Language Models Marlene Lutz Indira Sen Georg Ahnert Elisa Rogers M. Strohmaier 343 11 0 21 Jul 2025
Improving LLM Reasoning through Interpretable Role-Playing Steering Anyi Wang Dong Shu Yifan Wang Yunpu Ma Mengnan Du LLMSV LRM 215 3 0 09 Jun 2025
Understanding and Meeting Practitioner Needs When Measuring Representational Harms Caused by LLM-Based Systems Annual Meeting of the Association for Computational Linguistics (ACL), 2025 Emma Harvey Emily Sheng Su Lin Blodgett Alexandra Chouldechova Jean Garcia-Gathright Alexandra Olteanu Hanna M. Wallach 203 3 0 04 Jun 2025
Are Economists Always More Introverted? Analyzing Consistency in Persona-Assigned LLMs Manon Reusens Bart Baesens David Jurgens 248 0 0 03 Jun 2025
Do Language Models Think Consistently? A Study of Value Preferences Across Varying Response Lengths Inderjeet Nair Lu Wang 183 1 0 03 Jun 2025
Position is Power: System Prompts as a Mechanism of Bias in Large Language Models (LLMs) Conference on Fairness, Accountability and Transparency (FAccT), 2025 Anna Neumann Elisabeth Kirsten Muhammad Bilal Zafar Jatinder Singh 344 7 0 27 May 2025
What's the Difference? Supporting Users in Identifying the Effects of Prompt and Model Changes Through Token Patterns Annual Meeting of the Association for Computational Linguistics (ACL), 2025 Michael A. Hedderich Anyi Wang Raoyuan Zhao Florian Eichin Jonas Fischer Barbara Plank 320 3 0 22 Apr 2025
The Mind in the Machine: A Survey of Incorporating Psychological Theories in LLMs Zizhou Liu Ziwei Gong Lin Ai Zheng Hui Run Chen Colin Wayne Leach Michelle R. Greene Julia Hirschberg LLMAG 981 5 0 28 Mar 2025
Only a Little to the Left: A Theory-grounded Measure of Political Bias in Large Language Models Annual Meeting of the Association for Computational Linguistics (ACL), 2025 Mats Faulborn Indira Sen Max Pellert Andreas Spitz David Garcia ELM 333 5 0 20 Mar 2025
R.U.Psycho? Robust Unified Psychometric Testing of Language Models Julian Schelb Orr Borin David Garcia Andreas Spitz 263 1 0 13 Mar 2025
Large Language Models Often Say One Thing and Do Another International Conference on Learning Representations (ICLR), 2025 Ruoxi Xu Hongyu Lin Jia Zheng Jia Zheng Weixiang Zhou Le Sun Yingfei Sun 241 4 0 10 Mar 2025
The Call for Socially Aware Language Technologies Diyi Yang Dirk Hovy David Jurgens Barbara Plank VLM 394 14 0 24 Feb 2025
Pragmatics in the Era of Large Language Models: A Survey on Datasets, Evaluation, Opportunities and Challenges Annual Meeting of the Association for Computational Linguistics (ACL), 2025 Bolei Ma Yuting Li Wei Zhou Ziwei Gong Wenshu Fan Katja Jasinskaja Annemarie Friedrich Julia Hirschberg Frauke Kreuter Barbara Plank ELM LRM 393 15 0 17 Feb 2025
Does Prompt Formatting Have Any Impact on LLM Performance? Jia He Mukund Rungta David Koleczek Arshdeep Sekhon Franklin X Wang Sadid Hasan LLMAG LRM 298 136 0 15 Nov 2024
A dataset of questions on decision-theoretic reasoning in Newcomb-like problems Caspar Oesterheld Emery Cooper Miles Kodama Linh Chi Nguyen Ethan Perez 530 1 0 15 Nov 2024
BIG5-CHAT: Shaping LLM Personalities Through Training on Human-Grounded Data Annual Meeting of the Association for Computational Linguistics (ACL), 2024 Wenkai Li Jiarui Liu Andy Liu Xuhui Zhou Mona Diab Maarten Sap 401 31 0 21 Oct 2024
Cognitive phantoms in LLMs through the lens of latent variables Computers in Human Behavior (CHB), 2024 Sanne Peereboom Inga Schwabe Bennett Kleinberg 150 1 0 06 Sep 2024
Training LLMs to Recognize Hedges in Spontaneous Narratives Amie Paige Adil Soubki John Murzaku Owen Rambow Susan E. Brennan 207 1 0 06 Aug 2024
Self-assessment, Exhibition, and Recognition: a Review of Personality in Large Language Models Zhiyuan Wen Yu Yang Jiannong Cao Haoming Sun Ruosong Yang Shuaiqi Liu 246 7 0 25 Jun 2024
Cultural Value Differences of LLMs: Prompt, Language, and Model Size Qishuai Zhong Yike Yun Aixin Sun 169 9 0 17 Jun 2024
The Potential and Challenges of Evaluating Attitudes, Opinions, and Values in Large Language Models Bolei Ma Xinpeng Wang Tiancheng Hu Anna Haensch Michael A. Hedderich Barbara Plank Frauke Kreuter ALM 282 16 0 16 Jun 2024
Building Better AI Agents: A Provocation on the Utilisation of Persona in LLM-based Conversational Agents Guangzhi Sun Xiao Zhan Jose Such 250 56 0 26 May 2024
Beyond prompt brittleness: Evaluating the reliability and consistency of political worldviews in LLMs Tanise Ceron Neele Falk Ana Barić Dmitry Nikolaev Sebastian Padó 251 35 0 27 Feb 2024
Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models Paul Röttger Valentin Hofmann Valentina Pyatkin Musashi Hinck Hannah Rose Kirk Hinrich Schütze Dirk Hovy ELM 272 124 0 26 Feb 2024
MM-Soc: Benchmarking Multimodal Large Language Models in Social Media Platforms Yiqiao Jin Minje Choi Gaurav Verma Yongfeng Zhang Srijan Kumar 278 27 0 21 Feb 2024
Revisiting the Reliability of Psychological Scales on Large Language Models Shu Yang Wenxuan Wang Man Ho Lam E. Li Wenxiang Jiao Michael R. Lyu 346 24 0 31 May 2023