You don't need a personality test to know these models are unreliable: Assessing the Reliability of Large Language Models on Psychometric Instruments

16 November 2023
Bangzhao Shu
Lechen Zhang
Minje Choi
Lavinia Dunagan
Lajanugen Logeswaran
Moontae Lee
Dallas Card
David Jurgens
arXiv: 2311.09718

Papers citing "You don't need a personality test to know these models are unreliable: Assessing the Reliability of Large Language Models on Psychometric Instruments"

31 papers
Social Perceptions of English Spelling Variation on Twitter: A Comparative Analysis of Human and LLM Responses
Dong Nguyen
Laura Rosseel
28 Nov 2025
Do Psychometric Tests Work for Large Language Models? Evaluation of Tests on Sexism, Racism, and Morality
Jana Jung
Marlene Lutz
Indira Sen
M. Strohmaier
13 Oct 2025
One Model, Many Morals: Uncovering Cross-Linguistic Misalignments in Computational Moral Reasoning
Sualeha Farid
Jayden Lin
Zean Chen
Shivani Kumar
David Jurgens
25 Sep 2025
Established Psychometric vs. Ecologically Valid Questionnaires: Rethinking Psychological Assessments in Large Language Models
Dongmin Choi
Woojung Song
Jongwook Han
Eun-Ju Lee
Yohan Jo
12 Sep 2025
CAPE: Context-Aware Personality Evaluation Framework for Large Language Models
Jivnesh Sandhan
Fei Cheng
Tushar Sandhan
Yugo Murawaki
28 Aug 2025
The Prompt Makes the Person(a): A Systematic Evaluation of Sociodemographic Persona Prompting for Large Language Models
Marlene Lutz
Indira Sen
Georg Ahnert
Elisa Rogers
M. Strohmaier
21 Jul 2025
Improving LLM Reasoning through Interpretable Role-Playing Steering
Anyi Wang
Dong Shu
Yifan Wang
Yunpu Ma
Mengnan Du
09 Jun 2025
Understanding and Meeting Practitioner Needs When Measuring Representational Harms Caused by LLM-Based Systems
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Emma Harvey
Emily Sheng
Su Lin Blodgett
Alexandra Chouldechova
Jean Garcia-Gathright
Alexandra Olteanu
Hanna M. Wallach
04 Jun 2025
Are Economists Always More Introverted? Analyzing Consistency in Persona-Assigned LLMs
Manon Reusens
Bart Baesens
David Jurgens
03 Jun 2025
Do Language Models Think Consistently? A Study of Value Preferences Across Varying Response Lengths
Inderjeet Nair
Lu Wang
03 Jun 2025
Position is Power: System Prompts as a Mechanism of Bias in Large Language Models (LLMs)
Conference on Fairness, Accountability and Transparency (FAccT), 2025
Anna Neumann
Elisabeth Kirsten
Muhammad Bilal Zafar
Jatinder Singh
27 May 2025
What's the Difference? Supporting Users in Identifying the Effects of Prompt and Model Changes Through Token Patterns
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Michael A. Hedderich
Anyi Wang
Raoyuan Zhao
Florian Eichin
Jonas Fischer
Barbara Plank
22 Apr 2025
A Review of Incorporating Psychological Theories in LLMs
Zizhou Liu
Ziwei Gong
Lin Ai
Zheng Hui
Run Chen
Colin Wayne Leach
Michelle R. Greene
Julia Hirschberg
28 Mar 2025
Only a Little to the Left: A Theory-grounded Measure of Political Bias in Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Mats Faulborn
Indira Sen
Max Pellert
Andreas Spitz
David Garcia
20 Mar 2025
R.U.Psycho? Robust Unified Psychometric Testing of Language Models
Julian Schelb
Orr Borin
David Garcia
Andreas Spitz
13 Mar 2025
Large Language Models Often Say One Thing and Do Another
International Conference on Learning Representations (ICLR), 2025
Ruoxi Xu
Hongyu Lin
Jia Zheng
Weixiang Zhou
Le Sun
Yingfei Sun
10 Mar 2025
The Call for Socially Aware Language Technologies
Diyi Yang
Dirk Hovy
David Jurgens
Barbara Plank
24 Feb 2025
Pragmatics in the Era of Large Language Models: A Survey on Datasets, Evaluation, Opportunities and Challenges
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Bolei Ma
Yuting Li
Wei Zhou
Ziwei Gong
Wenshu Fan
Katja Jasinskaja
Annemarie Friedrich
Julia Hirschberg
Frauke Kreuter
Barbara Plank
17 Feb 2025
Does Prompt Formatting Have Any Impact on LLM Performance?
Jia He
Mukund Rungta
David Koleczek
Arshdeep Sekhon
Franklin X Wang
Sadid Hasan
15 Nov 2024
A dataset of questions on decision-theoretic reasoning in Newcomb-like problems
Caspar Oesterheld
Emery Cooper
Miles Kodama
Linh Chi Nguyen
Ethan Perez
15 Nov 2024
BIG5-CHAT: Shaping LLM Personalities Through Training on Human-Grounded Data
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Wenkai Li
Jiarui Liu
Andy Liu
Xuhui Zhou
Mona Diab
Maarten Sap
21 Oct 2024
Cognitive phantoms in LLMs through the lens of latent variables
Computers in Human Behavior (CHB), 2024
Sanne Peereboom
Inga Schwabe
Bennett Kleinberg
06 Sep 2024
Training LLMs to Recognize Hedges in Spontaneous Narratives
Amie Paige
Adil Soubki
John Murzaku
Owen Rambow
Susan E. Brennan
06 Aug 2024
Self-assessment, Exhibition, and Recognition: a Review of Personality in Large Language Models
Zhiyuan Wen
Yu Yang
Jiannong Cao
Haoming Sun
Ruosong Yang
Shuaiqi Liu
25 Jun 2024
Cultural Value Differences of LLMs: Prompt, Language, and Model Size
Qishuai Zhong
Yike Yun
Aixin Sun
17 Jun 2024
The Potential and Challenges of Evaluating Attitudes, Opinions, and Values in Large Language Models
Bolei Ma
Xinpeng Wang
Tiancheng Hu
Anna Haensch
Michael A. Hedderich
Barbara Plank
Frauke Kreuter
16 Jun 2024
Building Better AI Agents: A Provocation on the Utilisation of Persona in LLM-based Conversational Agents
Guangzhi Sun
Xiao Zhan
Jose Such
26 May 2024
Beyond prompt brittleness: Evaluating the reliability and consistency of political worldviews in LLMs
Tanise Ceron
Neele Falk
Ana Barić
Dmitry Nikolaev
Sebastian Padó
27 Feb 2024
Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models
Paul Röttger
Valentin Hofmann
Valentina Pyatkin
Musashi Hinck
Hannah Rose Kirk
Hinrich Schütze
Dirk Hovy
26 Feb 2024
MM-Soc: Benchmarking Multimodal Large Language Models in Social Media Platforms
Yiqiao Jin
Minje Choi
Gaurav Verma
Yongfeng Zhang
Srijan Kumar
21 Feb 2024
Revisiting the Reliability of Psychological Scales on Large Language Models
Shu Yang
Wenxuan Wang
Man Ho Lam
E. Li
Wenxiang Jiao
Michael R. Lyu
31 May 2023