Evaluating Large Language Models in Theory of Mind Tasks

Proceedings of the National Academy of Sciences of the United States of America (PNAS), 2023
4 February 2023
Michal Kosinski
    LLMAG, LRM

Papers citing "Evaluating Large Language Models in Theory of Mind Tasks"

50 / 108 papers shown
LLM Social Simulations Are a Promising Research Method
Jacy Reese Anthis
Ryan Liu
Sean M. Richardson
Austin C. Kozlowski
Bernard Koch
James A. Evans
Erik Brynjolfsson
Michael S. Bernstein
ALM
439
79
0
03 Apr 2025
Do Theory of Mind Benchmarks Need Explicit Human-like Reasoning in Language Models?
Yi-Long Lu
Chunhui Zhang
Jiajun Song
Lifeng Fan
Wei Wang
OffRL
206
0
0
02 Apr 2025
Trapped by Expectations: Functional Fixedness in LLM-Enabled Chat Search
Jiqun Liu
Jamshed Karimnazarov
Ryen W. White
136
3
0
02 Apr 2025
The Mind in the Machine: A Survey of Incorporating Psychological Theories in LLMs
Zizhou Liu
Ziwei Gong
Lin Ai
Zheng Hui
Run Chen
Colin Wayne Leach
Michelle R. Greene
Julia Hirschberg
LLMAG
917
5
0
28 Mar 2025
Gricean Norms as a Basis for Effective Collaboration
Adaptive Agents and Multi-Agent Systems (AAMAS), 2025
Fardin Saad
Pradeep K. Murukannaiah
Munindar P. Singh
877
1
0
18 Mar 2025
MetaScale: Test-Time Scaling with Evolving Meta-Thoughts
Qin Liu
Wenxuan Zhou
Nan Xu
James Y. Huang
Haiwei Yang
Sheng Zhang
Hoifung Poon
Mengzhao Chen
LLMAG, ReLM, AI4Cl, LRM
291
8
0
17 Mar 2025
Towards properly implementing Theory of Mind in AI systems: An account of four misconceptions
Ramira van der Meulen
Rineke Verbrugge
Max van Duijn
166
0
0
28 Feb 2025
Re-evaluating Theory of Mind evaluation in large language models
Philosophical transactions of the Royal Society of London. Series B, Biological sciences (Philos Trans R Soc Lond B Biol Sci), 2025
Jennifer Hu
Felix Sosa
T. Ullman
341
8
0
28 Feb 2025
On Benchmarking Human-Like Intelligence in Machines
Lance Ying
Katherine M. Collins
L. Wong
Ilia Sucholutsky
Ryan Liu
Adrian Weller
Tianmin Shu
Thomas Griffiths
Joshua B. Tenenbaum
ALM, ELM
850
19
0
27 Feb 2025
AgentSociety: Large-Scale Simulation of LLM-Driven Generative Agents Advances Understanding of Human Behaviors and Society
J. Piao
Yuwei Yan
Jun Zhang
Nian Li
Junbo Yan
...
Fengli Xu
Fang Zhang
Ke Rong
Jun Su
Yongqian Li
AI4CE
464
82
0
12 Feb 2025
Mind Your Theory: Theory of Mind Goes Deeper Than Reasoning
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Eitan Wagner
Nitay Alon
J. Barnby
Omri Abend
LRM
429
7
0
18 Dec 2024
Towards Full Delegation: Designing Ideal Agentic Behaviors for Travel Planning
Song Jiang
Da Ju
Andrew Cohen
Sasha Mitts
Aaron Foss
Justine T Kao
Xian Li
Yuandong Tian
316
5
0
21 Nov 2024
Advancements and limitations of LLMs in replicating human color-word associations
Discover Artificial Intelligence (Discover AI), 2024
Makoto Fukushima
Shusuke Eshita
Hiroshige Fukuhara
282
1
0
04 Nov 2024
RSA-Control: A Pragmatics-Grounded Lightweight Controllable Text Generation Framework
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Yifan Wang
Vera Demberg
170
7
0
24 Oct 2024
Chatting with Bots: AI, Speech Acts, and the Edge of Assertion
Iwan Williams
Tim Bayne
199
6
0
22 Oct 2024
SimpleToM: Exposing the Gap between Explicit ToM Inference and Implicit ToM Application in LLMs
Yuling Gu
Oyvind Tafjord
Hyunwoo Kim
Jared Moore
Ronan Le Bras
Peter Clark
Yejin Choi
252
25
0
17 Oct 2024
DocKD: Knowledge Distillation from LLMs for Open-World Document Understanding Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Sungnyun Kim
Haofu Liao
Srikar Appalaraju
Peng Tang
Zhuowen Tu
R. Satzoda
R. Manmatha
Vijay Mahadevan
Stefano Soatto
252
2
0
04 Oct 2024
Large Model Strategic Thinking, Small Model Efficiency: Transferring Theory of Mind in Large Language Models
Nunzio Lorè
Alireza Ilami
Babak Heydari
LRM
335
4
0
05 Aug 2024
Hypothetical Minds: Scaffolding Theory of Mind for Multi-Agent Tasks with Large Language Models
Logan Cross
Robert Z. Sparks
Agam Bhatia
Daniel L. K. Yamins
Nick Haber
LM&Ro, LRM, LLMAG
230
19
0
09 Jul 2024
LTLBench: Towards Benchmarks for Evaluating Temporal Logic Reasoning in Large Language Models
Weizhi Tang
Vaishak Belle
LRM
167
2
0
07 Jul 2024
Over the Edge of Chaos? Excess Complexity as a Roadblock to Artificial General Intelligence
Teo Susnjak
Timothy R. McIntosh
A. Barczak
N. Reyes
Tong Liu
Paul Watters
Malka N. Halgamuge
168
4
0
04 Jul 2024
Cactus: Towards Psychological Counseling Conversations using Cognitive Behavioral Theory
Suyeon Lee
Sunghwan Kim
Minju Kim
Dongjin Kang
Dongil Yang
...
Seungbeen Lee
Kyoung-Mee Chung
Youngjae Yu
Dongha Lee
Jinyoung Yeo
169
30
0
03 Jul 2024
Self-Cognition in Large Language Models: An Exploratory Study
Dongping Chen
Jiawen Shi
Yao Wan
Pan Zhou
Neil Zhenqiang Gong
Lichao Sun
LRM, LLMAG
195
9
0
01 Jul 2024
Towards a Science Exocortex
Kevin G. Yager
290
5
0
24 Jun 2024
Large Language Models Assume People are More Rational than We Really are
Ryan Liu
Jiayi Geng
Joshua C. Peterson
Ilia Sucholutsky
Thomas Griffiths
460
34
0
24 Jun 2024
Dissecting the Ullman Variations with a SCALPEL: Why do LLMs fail at Trivial Alterations to the False Belief Task?
Zhiqiang Pi
Annapurna Vadaparty
Benjamin Bergen
Cameron R. Jones
254
4
0
20 Jun 2024
Through the Theory of Mind's Eye: Reading Minds with Multimodal Video Large Language Models
Z. Chen
Tianchun Wang
Yizhou Wang
Michal Kosinski
Xiang Zhang
Yun Fu
Sheng Li
LRM
202
6
0
19 Jun 2024
Is persona enough for personality? Using ChatGPT to reconstruct an agent's latent personality from simple descriptions
Yongyi Ji
Zhisheng Tang
Mayank Kejriwal
211
7
0
18 Jun 2024
Tracking the perspectives of interacting language models
Hayden Helm
Brandon Duderstadt
Youngser Park
Carey E. Priebe
267
10
0
17 Jun 2024
Grammaticality Representation in ChatGPT as Compared to Linguists and Laypeople
Zhuang Qiu
Xufeng Duan
Zhenguang G. Cai
141
6
0
17 Jun 2024
The Potential and Challenges of Evaluating Attitudes, Opinions, and Values in Large Language Models
Bolei Ma
Xinpeng Wang
Tiancheng Hu
Anna Haensch
Michael A. Hedderich
Barbara Plank
Frauke Kreuter
ALM
210
16
0
16 Jun 2024
A Peek into Token Bias: Large Language Models Are Not Yet Genuine Reasoners
Bowen Jiang
Yangxinyu Xie
Zhuoqun Hao
Xiaomeng Wang
Tanwi Mallick
Weijie J. Su
Camillo J Taylor
Dan Roth
LRM
261
85
0
16 Jun 2024
Ollabench: Evaluating LLMs' Reasoning for Human-centric Interdependent Cybersecurity
Tam N. Nguyen
ELM
176
4
0
11 Jun 2024
Zero, Finite, and Infinite Belief History of Theory of Mind Reasoning in Large Language Models
Weizhi Tang
Vaishak Belle
LLMAG, LRM
159
1
0
07 Jun 2024
Towards Rationality in Language and Multimodal Agents: A Survey
Bowen Jiang
Yangxinyu Xie
Xiaomeng Wang
Yuan Yuan
Camillo J Taylor
Tanwi Mallick
Weijie J. Su
LLMAG
254
4
0
01 Jun 2024
Elements of World Knowledge (EWoK): A Cognition-Inspired Framework for Evaluating Basic World Knowledge in Language Models
Transactions of the Association for Computational Linguistics (TACL), 2024
Anna A. Ivanova
Aalok Sathe
Benjamin Lipkin
Unnathi Kumar
S. Radkani
...
Leshem Choshen
Roger Levy
Evelina Fedorenko
Josh Tenenbaum
Jacob Andreas
241
52
0
15 May 2024
LLM-Generated Black-box Explanations Can Be Adversarially Helpful
R. Ajwani
Shashidhar Reddy Javaji
Frank Rudzicz
Zining Zhu
AAML
243
22
0
10 May 2024
ToM-LM: Delegating Theory of Mind Reasoning to External Symbolic Executors in Large Language Models
Weizhi Tang
Vaishak Belle
LRM, LLMAG
199
1
0
23 Apr 2024
Language Models as Critical Thinking Tools: A Case Study of Philosophers
Andre Ye
Jared Moore
Rose Novick
Amy X. Zhang
KELM, ELM, LRM, LLMAG
162
10
0
06 Apr 2024
Distributed agency in second language learning and teaching through generative AI
Robert Godwin-Jones
163
48
0
29 Mar 2024
How Far Are We on the Decision-Making of LLMs? Evaluating LLMs' Gaming Ability in Multi-Agent Environments
Shu Yang
E. Li
Man Ho Lam
Tian Liang
Wenxuan Wang
Youliang Yuan
Wenxiang Jiao
Xing Wang
Zhaopeng Tu
Michael R. Lyu
ELM, LLMAG
474
52
0
18 Mar 2024
Shall We Team Up: Exploring Spontaneous Cooperation of Competing LLM Agents
Zengqing Wu
Run Peng
Shuyuan Zheng
Qianying Liu
Xu Han
Brian Inhyuk Kwon
Makoto Onizuka
Shaojie Tang
Chuan Xiao
239
30
0
19 Feb 2024
Can Generative Agents Predict Emotion?
Ciaran Regan
Nanami Iwahashi
Shogo Tanaka
Mizuki Oka
136
1
0
06 Feb 2024
What should I say? -- Interacting with AI and Natural Language Interfaces
Mark Adkins
158
1
0
12 Jan 2024
Exploring the Frontiers of LLMs in Psychological Applications: A Comprehensive Review
Artificial Intelligence Review (Artif Intell Rev), 2024
Luoma Ke
Song Tong
Peng Cheng
Kaiping Peng
OffRL, LM&MA
547
37
0
03 Jan 2024
The Tyranny of Possibilities in the Design of Task-Oriented LLM Systems: A Scoping Survey
Dhruv Dhamani
Mary Lou Maher
184
1
0
29 Dec 2023
Leveraging Word Guessing Games to Assess the Intelligence of Large Language Models
Tian Liang
Zhiwei He
Shu Yang
Wenxuan Wang
Wenxiang Jiao
Rui Wang
Yujiu Yang
Zhaopeng Tu
Shuming Shi
Xing Wang
LLMAG
228
8
0
31 Oct 2023
Generative Language Models Exhibit Social Identity Biases
Nature Computational Science (Nat. Comput. Sci.), 2023
Tiancheng Hu
Yara Kyrychenko
Steve Rathje
Nigel Collier
S. V. D. Linden
Jon Roozenbeek
262
107
0
24 Oct 2023
The Cultural Psychology of Large Language Models: Is ChatGPT a Holistic or Analytic Thinker?
Chuanyang Jin
Songyang Zhang
Tianmin Shu
Zhihan Cui
LLMAG, AI4MH
126
7
0
28 Aug 2023
Playing repeated games with Large Language Models
Nature Human Behaviour (Nat Hum Behav), 2023
Elif Akata
Lion Schulz
Julian Coda-Forno
Seong Joon Oh
Matthias Bethge
Eric Schulz
1.1K
184
0
26 May 2023