You don't need a personality test to know these models are unreliable: Assessing the Reliability of Large Language Models on Psychometric Instruments

16 November 2023
Bangzhao Shu
Lechen Zhang
Minje Choi
Lavinia Dunagan
Lajanugen Logeswaran
Moontae Lee
Dallas Card
David Jurgens
arXiv: 2311.09718

Papers citing "You don't need a personality test to know these models are unreliable: Assessing the Reliability of Large Language Models on Psychometric Instruments"

31 papers
Social Perceptions of English Spelling Variation on Twitter: A Comparative Analysis of Human and LLM Responses
Dong Nguyen
Laura Rosseel
28 Nov 2025
Do Psychometric Tests Work for Large Language Models? Evaluation of Tests on Sexism, Racism, and Morality
Jana Jung
Marlene Lutz
Indira Sen
M. Strohmaier
13 Oct 2025
One Model, Many Morals: Uncovering Cross-Linguistic Misalignments in Computational Moral Reasoning
Sualeha Farid
Jayden Lin
Zean Chen
Shivani Kumar
David Jurgens
25 Sep 2025
Established Psychometric vs. Ecologically Valid Questionnaires: Rethinking Psychological Assessments in Large Language Models
Dongmin Choi
Woojung Song
Jongwook Han
Eun-Ju Lee
Yohan Jo
12 Sep 2025
CAPE: Context-Aware Personality Evaluation Framework for Large Language Models
Jivnesh Sandhan
Fei Cheng
Tushar Sandhan
Yugo Murawaki
28 Aug 2025
The Prompt Makes the Person(a): A Systematic Evaluation of Sociodemographic Persona Prompting for Large Language Models
Marlene Lutz
Indira Sen
Georg Ahnert
Elisa Rogers
M. Strohmaier
21 Jul 2025
Improving LLM Reasoning through Interpretable Role-Playing Steering
Anyi Wang
Dong Shu
Yifan Wang
Yunpu Ma
Mengnan Du
09 Jun 2025
Understanding and Meeting Practitioner Needs When Measuring Representational Harms Caused by LLM-Based Systems
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Emma Harvey
Emily Sheng
Su Lin Blodgett
Alexandra Chouldechova
Jean Garcia-Gathright
Alexandra Olteanu
Hanna M. Wallach
04 Jun 2025
Are Economists Always More Introverted? Analyzing Consistency in Persona-Assigned LLMs
Manon Reusens
Bart Baesens
David Jurgens
03 Jun 2025
Do Language Models Think Consistently? A Study of Value Preferences Across Varying Response Lengths
Inderjeet Nair
Lu Wang
03 Jun 2025
Position is Power: System Prompts as a Mechanism of Bias in Large Language Models (LLMs)
Conference on Fairness, Accountability and Transparency (FAccT), 2025
Anna Neumann
Elisabeth Kirsten
Muhammad Bilal Zafar
Jatinder Singh
27 May 2025
What's the Difference? Supporting Users in Identifying the Effects of Prompt and Model Changes Through Token Patterns
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Michael A. Hedderich
Anyi Wang
Raoyuan Zhao
Florian Eichin
Jonas Fischer
Barbara Plank
22 Apr 2025
A Review of Incorporating Psychological Theories in LLMs
Zizhou Liu
Ziwei Gong
Lin Ai
Zheng Hui
Run Chen
Colin Wayne Leach
Michelle R. Greene
Julia Hirschberg
28 Mar 2025
Only a Little to the Left: A Theory-grounded Measure of Political Bias in Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Mats Faulborn
Indira Sen
Max Pellert
Andreas Spitz
David Garcia
20 Mar 2025
R.U.Psycho? Robust Unified Psychometric Testing of Language Models
Julian Schelb
Orr Borin
David Garcia
Andreas Spitz
13 Mar 2025
Large Language Models Often Say One Thing and Do Another
International Conference on Learning Representations (ICLR), 2025
Ruoxi Xu
Hongyu Lin
Jia Zheng
Weixiang Zhou
Le Sun
Yingfei Sun
10 Mar 2025
The Call for Socially Aware Language Technologies
Diyi Yang
Dirk Hovy
David Jurgens
Barbara Plank
24 Feb 2025
Pragmatics in the Era of Large Language Models: A Survey on Datasets, Evaluation, Opportunities and Challenges
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Bolei Ma
Yuting Li
Wei Zhou
Ziwei Gong
Wenshu Fan
Katja Jasinskaja
Annemarie Friedrich
Julia Hirschberg
Frauke Kreuter
Barbara Plank
17 Feb 2025
Does Prompt Formatting Have Any Impact on LLM Performance?
Jia He
Mukund Rungta
David Koleczek
Arshdeep Sekhon
Franklin X Wang
Sadid Hasan
15 Nov 2024
A dataset of questions on decision-theoretic reasoning in Newcomb-like problems
Caspar Oesterheld
Emery Cooper
Miles Kodama
Linh Chi Nguyen
Ethan Perez
15 Nov 2024
BIG5-CHAT: Shaping LLM Personalities Through Training on Human-Grounded Data
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Wenkai Li
Jiarui Liu
Andy Liu
Xuhui Zhou
Mona Diab
Maarten Sap
21 Oct 2024
Cognitive phantoms in LLMs through the lens of latent variables
Computers in Human Behavior (CHB), 2024
Sanne Peereboom
Inga Schwabe
Bennett Kleinberg
06 Sep 2024
Training LLMs to Recognize Hedges in Spontaneous Narratives
Amie Paige
Adil Soubki
John Murzaku
Owen Rambow
Susan E. Brennan
06 Aug 2024
Self-assessment, Exhibition, and Recognition: a Review of Personality in Large Language Models
Zhiyuan Wen
Yu Yang
Jiannong Cao
Haoming Sun
Ruosong Yang
Shuaiqi Liu
25 Jun 2024
Cultural Value Differences of LLMs: Prompt, Language, and Model Size
Qishuai Zhong
Yike Yun
Aixin Sun
17 Jun 2024
The Potential and Challenges of Evaluating Attitudes, Opinions, and Values in Large Language Models
Bolei Ma
Xinpeng Wang
Tiancheng Hu
Anna Haensch
Michael A. Hedderich
Barbara Plank
Frauke Kreuter
16 Jun 2024
Building Better AI Agents: A Provocation on the Utilisation of Persona in LLM-based Conversational Agents
Guangzhi Sun
Xiao Zhan
Jose Such
26 May 2024
Beyond prompt brittleness: Evaluating the reliability and consistency of political worldviews in LLMs
Tanise Ceron
Neele Falk
Ana Barić
Dmitry Nikolaev
Sebastian Padó
27 Feb 2024
Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models
Paul Röttger
Valentin Hofmann
Valentina Pyatkin
Musashi Hinck
Hannah Rose Kirk
Hinrich Schütze
Dirk Hovy
26 Feb 2024
MM-Soc: Benchmarking Multimodal Large Language Models in Social Media Platforms
Yiqiao Jin
Minje Choi
Gaurav Verma
Yongfeng Zhang
Srijan Kumar
21 Feb 2024
Revisiting the Reliability of Psychological Scales on Large Language Models
Shu Yang
Wenxuan Wang
Man Ho Lam
E. Li
Wenxiang Jiao
Michael R. Lyu
31 May 2023