Evaluating Large Language Models in Theory of Mind Tasks

Proceedings of the National Academy of Sciences of the United States of America (PNAS), 2023
4 February 2023
Michal Kosinski
LLMAG LRM

Papers citing "Evaluating Large Language Models in Theory of Mind Tasks"

50 / 109 papers shown
Tacit Bidder-Side Collusion: Artificial Intelligence in Dynamic Auctions
Sriram Tolety
48
0
0
26 Nov 2025
Mind the Motions: Benchmarking Theory-of-Mind in Everyday Body Language
Seungbeen Lee
Jinhong Jeong
Donghyun Kim
Yejin Son
Youngjae Yu
94
1
0
19 Nov 2025
From Passive to Persuasive: Steering Emotional Nuance in Human-AI Negotiation
Niranjan Chebrolu
Gerard Christopher Yeo
Kokil Jaidka
LLMSV
196
0
0
16 Nov 2025
Pruning as Regularization: Sensitivity-Aware One-Shot Pruning in ASR
Julian Irigoyen
Arthur Söhler
Andreas Søeborg Kirkedal
108
2
0
11 Nov 2025
Social Simulations with Large Language Model Risk Utopian Illusion
Ning Bian
Xianpei Han
Hongyu Lin
Baolei Wu
Jun Wang
72
0
0
24 Oct 2025
Are Large Language Models Sensitive to the Motives Behind Communication?
Addison J. Wu
Ryan Liu
Kerem Oktar
T. Sumers
Thomas L. Griffiths
156
0
0
22 Oct 2025
DPRF: A Generalizable Dynamic Persona Refinement Framework for Optimizing Behavior Alignment Between Personalized LLM Role-Playing Agents and Humans
Bingsheng Yao
Bo Sun
Yuanzhe Dong
Yuxuan Lu
Dakuo Wang
313
0
0
16 Oct 2025
Doing Things with Words: Rethinking Theory of Mind Simulation in Large Language Models
A. Lombardi
Alessandro Lenci
LLMAG
128
1
0
15 Oct 2025
Do You Get the Hint? Benchmarking LLMs on the Board Game Concept
I. Gevers
Walter Daelemans
LRM
72
0
0
15 Oct 2025
Circuit Distillation
Somin Wadhwa
Silvio Amir
Byron C. Wallace
133
0
0
29 Sep 2025
Infusing Theory of Mind into Socially Intelligent LLM Agents
EunJeong Hwang
Yuwei Yin
Giuseppe Carenini
Peter West
Vered Shwartz
LLMAG
1.6K
1
0
26 Sep 2025
LVLMs are Bad at Overhearing Human Referential Communication
Zhengxiang Wang
Weiling Li
Panagiotis Kaliosis
Owen Rambow
Susan E. Brennan
125
1
0
15 Sep 2025
Preservation of Language Understanding Capabilities in Speech-aware Large Language Models
Marek Kubis
Paweł Skórzewski
Iwona Christop
Mateusz Czyżnikiewicz
Jakub Kubiak
Łukasz Bondaruk
Marcin Lewandowski
AuLLM ELM
178
0
0
15 Sep 2025
One Model, Two Minds: A Context-Gated Graph Learner that Recreates Human Biases
Shalima Binta Manir
Tim Oates
68
0
0
10 Sep 2025
Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth
Yang Wang
Chenghao Xiao
Chia-Yi Hsiao
Zi Yan Chang
Chi-Li Chen
Tyler Loakman
Chenghua Lin
235
1
0
04 Sep 2025
The Personality Illusion: Revealing Dissociation Between Self-Reports & Behavior in LLMs
Pengrui Han
Rafal Kocielnik
Peiyang Song
Ramit Debnath
Dean Mobbs
Anima Anandkumar
R. Alvarez
318
5
0
03 Sep 2025
LLMs and their Limited Theory of Mind: Evaluating Mental State Annotations in Situated Dialogue
Katharine Kowalyshyn
Matthias Scheutz
88
0
0
02 Sep 2025
Bridging Minds and Machines: Toward an Integration of AI and Cognitive Science
Rui Mao
Qian Liu
Xiao Li
Erik Cambria
Amir Hussain
AI4CE
100
0
0
28 Aug 2025
Who Sees What? Structured Thought-Action Sequences for Epistemic Reasoning in LLMs
Luca Annese
Sabrina Patania
Silvia Serino
Tom Foulsham
Silvia Rossi
Azzurra Ruggeri
Dimitri Ognibene
LRM
106
0
0
20 Aug 2025
Large Language Models Do Not Simulate Human Psychology
Sarah Schröder
Thekla Morgenroth
Ulrike Kuhl
Valerie Vaquet
Benjamin Paaßen
148
7
0
09 Aug 2025
UAV-ON: A Benchmark for Open-World Object Goal Navigation with Aerial Agents
Jianqiang Xiao
Yuexuan Sun
Yixin Shao
Boxi Gan
Rongqiang Liu
Yanjing Wu
Weili Gua
Xiang Deng
248
9
0
01 Aug 2025
Do Large Language Models Have a Planning Theory of Mind? Evidence from MindGames: a Multi-Step Persuasion Task
Jared Moore
Ned Cooper
Rasmus Overmark
Beba Cibralic
Nick Haber
Cameron R. Jones
LLMAG LRM
161
1
0
22 Jul 2025
Investigating VLM Hallucination from a Cognitive Psychology Perspective: A First Step Toward Interpretation with Intriguing Observations
Xiangrui Liu
Man Luo
Agneet Chatterjee
Hua Wei
Chitta Baral
Yezhou Yang
144
0
0
03 Jul 2025
Visual Structures Helps Visual Reasoning: Addressing the Binding Problem in VLMs
Amirmohammad Izadi
Mohammad Ali Banayeeanzade
Fatemeh Askari
Ali Rahimiakbar
Mohammad Mahdi Vahedi
Hosein Hasani
M. Baghshah
LRM
198
1
0
27 Jun 2025
Language-Informed Synthesis of Rational Agent Models for Grounded Theory-of-Mind Reasoning On-The-Fly
Lance Ying
Ryan Truong
Katherine M. Collins
Cedegao E. Zhang
Megan Wei
Tyler Brooke-Wilson
Tan Zhi-Xuan
Lionel Wong
J. Tenenbaum
LLMAG
144
5
0
20 Jun 2025
From Prompts to Constructs: A Dual-Validity Framework for LLM Research in Psychology
Zhicheng Lin
195
2
0
20 Jun 2025
PRISON: Unmasking the Criminal Potential of Large Language Models
Xinyi Wu
Geng Hong
Pei Chen
Yueyue Chen
Xudong Pan
Min Yang
238
1
0
19 Jun 2025
Can structural correspondences ground real world representational content in Large Language Models?
Iwan Williams
140
1
0
19 Jun 2025
Large Language Models are Near-Optimal Decision-Makers with a Non-Human Learning Behavior
Hao Li
Gengrui Zhang
Petter Holme
Shuyue Hu
Zhen Wang
170
1
0
19 Jun 2025
From Black Boxes to Transparent Minds: Evaluating and Enhancing the Theory of Mind in Multimodal Large Language Models
Xinyang Li
Siqi Liu
Bochao Zou
Jiansheng Chen
Huimin Ma
188
2
0
17 Jun 2025
Behavioral Generative Agents for Energy Operations
Cong Chen
Omer Karaduman
Xu Kuang
AI4CE
62
0
0
14 Jun 2025
UniToMBench: Integrating Perspective-Taking to Improve Theory of Mind in LLMs
Prameshwar Thiyagarajan
Vaishnavi Parimi
Shamant Sai
Soumil Garg
Zhangir Meirbek
Nitin Yarlagadda
Kevin Zhu
Chris Kim
LLMAG
182
1
0
11 Jun 2025
LLM-D12: A Dual-Dimensional Scale of Instrumental and Relational Dependencies on Large Language Models
ACM Transactions on the Web (TWEB), 2025
Ala Yankouskaya
Areej B. Babiker
Syeda W. F. Rizvi
Sameha Alshakhsi
Magnus Liebherr
Raian Ali
191
4
0
07 Jun 2025
Can Vision Language Models Infer Human Gaze Direction? A Controlled Study
Zory Zhang
Pinyuan Feng
Bingyang Wang
Tianwei Zhao
Suyang Yu
Qingying Gao
Hokin Deng
Ziqiao Ma
Yijiang Li
Dezhi Luo
213
2
0
04 Jun 2025
Representations of Fact, Fiction and Forecast in Large Language Models: Epistemics and Attitudes
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Meng Li
Michael Vrazitulis
David Schlangen
225
0
0
02 Jun 2025
Effects of Theory of Mind and Prosocial Beliefs on Steering Human-Aligned Behaviors of LLMs in Ultimatum Games
Neemesh Yadav
Palakorn Achananuparp
Jing Jiang
Ee-Peng Lim
LRM
104
0
0
30 May 2025
ValueSim: Generating Backstories to Model Individual Value Systems
Bangde Du
Ziyi Ye
Zhijing Wu
Jankowska Monika
Shuqi Zhu
Jiaxin Mao
Yujia Zhou
Yiqun Liu
193
1
0
28 May 2025
Large Language Models Miss the Multi-Agent Mark
Emanuele La Malfa
Gabriele La Malfa
Samuele Marro
Jie M. Zhang
Elizabeth Black
Micheal Luck
Juil Sock
Michael Wooldridge
LLMAG
303
0
0
27 May 2025
The Pragmatic Mind of Machines: Tracing the Emergence of Pragmatic Competence in Large Language Models
Kefan Yu
Qingcheng Zeng
Weihao Xuan
Wanxin Li
Jingyi Wu
Rob Voigt
ReLM LRM
283
0
0
24 May 2025
Multi-Party Conversational Agents: A Survey
Sagar Sapkota
M. Hasan
Mubarak Shah
Santu Karmaker
LLMAG
256
0
0
24 May 2025
DEL-ToM: Inference-Time Scaling for Theory-of-Mind Reasoning via Dynamic Epistemic Logic
Yuheng Wu
Jianwen Xie
Denghui Zhang
Zhaozhuo Xu
LRM VLM
192
2
0
22 May 2025
Language Models use Lookbacks to Track Beliefs
Nikhil Prakash
Natalie Shapira
Arnab Sen Sharma
Christoph Riedl
Yonatan Belinkov
Tamar Rott Shaham
David Bau
Atticus Geiger
KELM
291
9
0
20 May 2025
Adversarial Testing in LLMs: Insights into Decision-Making Vulnerabilities
Lili Zhang
Haomiaomiao Wang
Long Cheng
Libao Deng
Tomas E. Ward
AAML
335
0
0
19 May 2025
PsyMem: Fine-grained psychological alignment and Explicit Memory Control for Advanced Role-Playing LLMs
Xilong Cheng
Yunxiao Qin
Yuting Tan
Zhengnan Li
Ye Wang
Hongjiang Xiao
Yuan Zhang
337
0
0
19 May 2025
BeliefNest: A Joint Action Simulator for Embodied Agents with Theory of Mind
Rikunari Sagara
Koichiro Terao
Naoto Iwahashi
LM&Ro
395
0
0
18 May 2025
AI-enhanced semantic feature norms for 786 concepts
Siddharth Suresh
Kushin Mukherjee
Tyler Giallanza
Xizheng Yu
Mia Patil
Jonathan Cohen
Timothy T. Rogers
184
1
0
15 May 2025
Empirically evaluating commonsense intelligence in large language models with large-scale human judgments
Tuan Dung Nguyen
Duncan J. Watts
Mark E. Whiting
ELM
363
3
0
15 May 2025
Beyond Recognition: Evaluating Visual Perspective Taking in Vision Language Models
Gracjan Góral
Alicja Ziarko
Piotr Miłoś
Michał Nauman
Maciej Wołczyk
Michał Kosiński
LRM
274
0
0
03 May 2025
The Convergent Ethics of AI? Analyzing Moral Foundation Priorities in Large Language Models with a Multi-Framework Approach
Chad Coleman
W. Russell Neuman
Ali Dasdan
Safinah Ali
Manan Shah
ELM LRM
212
2
0
27 Apr 2025
AI Awareness
Xianrui Li
Haoyuan Shi
Rongwu Xu
Wei Xu
438
3
0
25 Apr 2025