
Evaluating Large Language Models in Theory of Mind Tasks

Proceedings of the National Academy of Sciences of the United States of America (PNAS), 2023

4 February 2023

Michal Kosinski

LLMAG

LRM


Papers citing "Evaluating Large Language Models in Theory of Mind Tasks"

50 / 108 papers shown

Title
LLM Social Simulations Are a Promising Research Method Jacy Reese Anthis Ryan Liu Sean M. Richardson Austin C. Kozlowski Bernard Koch James A. Evans Erik Brynjolfsson Michael S. Bernstein ALM 439 79 0 03 Apr 2025
Do Theory of Mind Benchmarks Need Explicit Human-like Reasoning in Language Models? Yi-Long Lu Chunhui Zhang Jiajun Song Lifeng Fan Wei Wang OffRL 206 0 0 02 Apr 2025
Trapped by Expectations: Functional Fixedness in LLM-Enabled Chat Search Jiqun Liu Jamshed Karimnazarov Ryen W. White 136 3 0 02 Apr 2025
The Mind in the Machine: A Survey of Incorporating Psychological Theories in LLMs Zizhou Liu Ziwei Gong Lin Ai Zheng Hui Run Chen Colin Wayne Leach Michelle R. Greene Julia Hirschberg LLMAG 917 5 0 28 Mar 2025
Gricean Norms as a Basis for Effective Collaboration Adaptive Agents and Multi-Agent Systems (AAMAS), 2025 Fardin Saad Pradeep K. Murukannaiah Munindar P. Singh 877 1 0 18 Mar 2025
MetaScale: Test-Time Scaling with Evolving Meta-Thoughts Qin Liu Wenxuan Zhou Nan Xu James Y. Huang Haiwei Yang Sheng Zhang Hoifung Poon Mengzhao Chen LLMAG ReLM AI4Cl LRM 291 8 0 17 Mar 2025
Towards properly implementing Theory of Mind in AI systems: An account of four misconceptions Ramira van der Meulen Rineke Verbrugge Max van Duijn 166 0 0 28 Feb 2025
Re-evaluating Theory of Mind evaluation in large language models Philosophical transactions of the Royal Society of London. Series B, Biological sciences (Philos Trans R Soc Lond B Biol Sci), 2025 Jennifer Hu Felix Sosa T. Ullman 341 8 0 28 Feb 2025
On Benchmarking Human-Like Intelligence in Machines Lance Ying Katherine M. Collins L. Wong Ilia Sucholutsky Ryan Liu Adrian Weller Tianmin Shu Thomas Griffiths Joshua B. Tenenbaum ALM ELM 850 19 0 27 Feb 2025
AgentSociety: Large-Scale Simulation of LLM-Driven Generative Agents Advances Understanding of Human Behaviors and Society J. Piao Yuwei Yan Jun Zhang Nian Li Junbo Yan ... Fengli Xu Fang Zhang Ke Rong Jun Su Yongqian Li AI4CE 464 82 0 12 Feb 2025
Mind Your Theory: Theory of Mind Goes Deeper Than Reasoning Annual Meeting of the Association for Computational Linguistics (ACL), 2024 Eitan Wagner Nitay Alon J. Barnby Omri Abend LRM 429 7 0 18 Dec 2024
Towards Full Delegation: Designing Ideal Agentic Behaviors for Travel Planning Song Jiang Da JU Andrew Cohen Sasha Mitts Aaron Foss Justine T Kao Xian Li Yuandong Tian 316 5 0 21 Nov 2024
Advancements and limitations of LLMs in replicating human color-word associations Discover Artificial Intelligence (Discover AI), 2024 Makoto Fukushima Shusuke Eshita Hiroshige Fukuhara 282 1 0 04 Nov 2024
RSA-Control: A Pragmatics-Grounded Lightweight Controllable Text Generation Framework Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024 Yifan Wang Vera Demberg 170 7 0 24 Oct 2024
Chatting with Bots: AI, Speech Acts, and the Edge of Assertion Iwan Williams Tim Bayne 199 6 0 22 Oct 2024
SimpleToM: Exposing the Gap between Explicit ToM Inference and Implicit ToM Application in LLMs Yuling Gu Oyvind Tafjord Hyunwoo Kim Jared Moore Ronan Le Bras Peter Clark Yejin Choi 252 25 0 17 Oct 2024
DocKD: Knowledge Distillation from LLMs for Open-World Document Understanding Models Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024 Sungnyun Kim Haofu Liao Srikar Appalaraju Peng Tang Zhuowen Tu R. Satzoda R. Manmatha Vijay Mahadevan Stefano Soatto 252 2 0 04 Oct 2024
Large Model Strategic Thinking, Small Model Efficiency: Transferring Theory of Mind in Large Language Models Nunzio Lorè Alireza Ilami Babak Heydari LRM 335 4 0 05 Aug 2024
Hypothetical Minds: Scaffolding Theory of Mind for Multi-Agent Tasks with Large Language Models Logan Cross Robert Z. Sparks Agam Bhatia Daniel L. K. Yamins Nick Haber LM&Ro LRM LLMAG 230 19 0 09 Jul 2024
LTLBench: Towards Benchmarks for Evaluating Temporal Logic Reasoning in Large Language Models Weizhi Tang Vaishak Belle LRM 167 2 0 07 Jul 2024
Over the Edge of Chaos? Excess Complexity as a Roadblock to Artificial General Intelligence Teo Susnjak Timothy R. McIntosh A. Barczak N. Reyes Tong Liu Paul Watters Malka N. Halgamuge 168 4 0 04 Jul 2024
Cactus: Towards Psychological Counseling Conversations using Cognitive Behavioral Theory Suyeon Lee Sunghwan Kim Minju Kim Dongjin Kang Dongil Yang ... Seungbeen Lee Kyoung-Mee Chung Youngjae Yu Dongha Lee Jinyoung Yeo 169 30 0 03 Jul 2024
Self-Cognition in Large Language Models: An Exploratory Study Dongping Chen Jiawen Shi Yao Wan Pan Zhou Neil Zhenqiang Gong Lichao Sun LRM LLMAG 195 9 0 01 Jul 2024
Towards a Science Exocortex Kevin G. Yager 290 5 0 24 Jun 2024
Large Language Models Assume People are More Rational than We Really are Ryan Liu Jiayi Geng Joshua C. Peterson Ilia Sucholutsky Thomas Griffiths 460 34 0 24 Jun 2024
Dissecting the Ullman Variations with a SCALPEL: Why do LLMs fail at Trivial Alterations to the False Belief Task? Zhiqiang Pi Annapurna Vadaparty Benjamin Bergen Cameron R. Jones 254 4 0 20 Jun 2024
Through the Theory of Mind's Eye: Reading Minds with Multimodal Video Large Language Models Z. Chen Tianchun Wang Yizhou Wang Michal Kosinski Xiang Zhang Yun Fu Sheng Li LRM 202 6 0 19 Jun 2024
Is persona enough for personality? Using ChatGPT to reconstruct an agent's latent personality from simple descriptions Yongyi Ji Zhisheng Tang Mayank Kejriwal 211 7 0 18 Jun 2024
Tracking the perspectives of interacting language models Hayden Helm Brandon Duderstadt Youngser Park Carey E. Priebe 267 10 0 17 Jun 2024
Grammaticality Representation in ChatGPT as Compared to Linguists and Laypeople Zhuang Qiu Xufeng Duan Zhenguang G. Cai 141 6 0 17 Jun 2024
The Potential and Challenges of Evaluating Attitudes, Opinions, and Values in Large Language Models Bolei Ma Xinpeng Wang Tiancheng Hu Anna Haensch Michael A. Hedderich Barbara Plank Frauke Kreuter ALM 210 16 0 16 Jun 2024
A Peek into Token Bias: Large Language Models Are Not Yet Genuine Reasoners Bowen Jiang Yangxinyu Xie Zhuoqun Hao Xiaomeng Wang Tanwi Mallick Weijie J. Su Camillo J Taylor Dan Roth LRM 261 85 0 16 Jun 2024
Ollabench: Evaluating LLMs' Reasoning for Human-centric Interdependent Cybersecurity Tam n. Nguyen ELM 176 4 0 11 Jun 2024
Zero, Finite, and Infinite Belief History of Theory of Mind Reasoning in Large Language Models Weizhi Tang Vaishak Belle LLMAG LRM 159 1 0 07 Jun 2024
Towards Rationality in Language and Multimodal Agents: A Survey Bowen Jiang Yangxinyu Xie Xiaomeng Wang Yuan Yuan Camillo J Taylor Tanwi Mallick Weijie J. Su LLMAG 254 4 0 01 Jun 2024
Elements of World Knowledge (EWoK): A Cognition-Inspired Framework for Evaluating Basic World Knowledge in Language Models Transactions of the Association for Computational Linguistics (TACL), 2024 Anna A. Ivanova Aalok Sathe Benjamin Lipkin Unnathi Kumar S. Radkani ... Leshem Choshen Roger Levy Evelina Fedorenko Josh Tenenbaum Jacob Andreas 241 52 0 15 May 2024
LLM-Generated Black-box Explanations Can Be Adversarially Helpful R. Ajwani Shashidhar Reddy Javaji Frank Rudzicz Zining Zhu AAML 243 22 0 10 May 2024
ToM-LM: Delegating Theory of Mind Reasoning to External Symbolic Executors in Large Language Models Weizhi Tang Vaishak Belle LRM LLMAG 199 1 0 23 Apr 2024
Language Models as Critical Thinking Tools: A Case Study of Philosophers Andre Ye Jared Moore Rose Novick Amy X. Zhang KELM ELM LRM LLMAG 162 10 0 06 Apr 2024
Distributed agency in second language learning and teaching through generative AI Robert Godwin-Jones 163 48 0 29 Mar 2024
How Far Are We on the Decision-Making of LLMs? Evaluating LLMs' Gaming Ability in Multi-Agent Environments Shu Yang E. Li Man Ho Lam Tian Liang Wenxuan Wang Youliang Yuan Wenxiang Jiao Xing Wang Zhaopeng Tu Michael R. Lyu ELM LLMAG 474 52 0 18 Mar 2024
Shall We Team Up: Exploring Spontaneous Cooperation of Competing LLM Agents Zengqing Wu Run Peng Shuyuan Zheng Qianying Liu Xu Han Brian Inhyuk Kwon Makoto Onizuka Shaojie Tang Chuan Xiao 239 30 0 19 Feb 2024
Can Generative Agents Predict Emotion? Ciaran Regan Nanami Iwahashi Shogo Tanaka Mizuki Oka 136 1 0 06 Feb 2024
What should I say? -- Interacting with AI and Natural Language Interfaces Mark Adkins 158 1 0 12 Jan 2024
Exploring the Frontiers of LLMs in Psychological Applications: A Comprehensive Review Artificial Intelligence Review (Artif Intell Rev), 2024 Luoma Ke Song Tong Peng Cheng Kaiping Peng OffRL LM&MA 547 37 0 03 Jan 2024
The Tyranny of Possibilities in the Design of Task-Oriented LLM Systems: A Scoping Survey Dhruv Dhamani Mary Lou Maher 184 1 0 29 Dec 2023
Leveraging Word Guessing Games to Assess the Intelligence of Large Language Models Tian Liang Zhiwei He Shu Yang Wenxuan Wang Wenxiang Jiao Rui Wang Yujiu Yang Zhaopeng Tu Shuming Shi Xing Wang LLMAG 228 8 0 31 Oct 2023
Generative Language Models Exhibit Social Identity Biases Nature Computational Science (Nat. Comput. Sci.), 2023 Tiancheng Hu Yara Kyrychenko Steve Rathje Nigel Collier S. V. D. Linden Jon Roozenbeek 262 107 0 24 Oct 2023
The Cultural Psychology of Large Language Models: Is ChatGPT a Holistic or Analytic Thinker? Chuanyang Jin Songyang Zhang Tianmin Shu Zhihan Cui LLMAG AI4MH 126 7 0 28 Aug 2023
Playing repeated games with Large Language Models Nature Human Behaviour (Nat Hum Behav), 2023 Elif Akata Lion Schulz Julian Coda-Forno Seong Joon Oh Matthias Bethge Eric Schulz 1.1K 184 0 26 May 2023