Large Language Models Fail on Trivial Alterations to Theory-of-Mind Tasks

16 February 2023

Papers citing "Large Language Models Fail on Trivial Alterations to Theory-of-Mind Tasks"

50 / 100 papers shown

MuMA-ToM: Multi-modal Multi-Agent Theory of Mind AAAI Conference on Artificial Intelligence (AAAI), 2024 Haojun Shi Suyu Ye Xinyu Fang Chuanyang Jin Leyla Isik Yen-Ling Kuo Tianmin Shu LLMAG 393 32 0 22 Aug 2024
Large Language Model Recall Uncertainty is Modulated by the Fan Effect Jesse Roberts Kyle Moore Thao Pham Oseremhen Ewaleifoh Doug Fisher 270 6 0 08 Jul 2024
TimeToM: Temporal Space is the Key to Unlocking the Door of Large Language Models' Theory-of-Mind Guiyang Hou Wenqi Zhang Yongliang Shen Linjuan Wu Weiming Lu LRM AI4CE 176 19 0 01 Jul 2024
Large Language Models Assume People are More Rational than We Really are Ryan Liu Jiayi Geng Joshua C. Peterson Ilia Sucholutsky Thomas Griffiths 480 35 0 24 Jun 2024
Dissecting the Ullman Variations with a SCALPEL: Why do LLMs fail at Trivial Alterations to the False Belief Task? Zhiqiang Pi Annapurna Vadaparty Benjamin Bergen Cameron R. Jones 274 4 0 20 Jun 2024
What is the Visual Cognition Gap between Humans and Multimodal LLMs? Xu Cao Yifan Shen Bolin Lai Wenqian Ye Yunsheng Ma ... Jintai Chen Meihuan Huang Jianguo Cao Aidong Zhang James M. Rehg 323 20 0 14 Jun 2024
GPT-ology, Computational Models, Silicon Sampling: How should we think about LLMs in Cognitive Science? Desmond C. Ong 283 5 0 13 Jun 2024
A social path to human-like artificial intelligence Edgar A. Duénez-Guzmán Suzanne Sadedin Jane X. Wang Kevin R. McKee Joel Z Leibo GNN 295 36 0 22 May 2024
Evaluating and Modeling Social Intelligence: A Comparative Study of Human and AI Capabilities Junqi Wang Chunhui Zhang Jiapeng Li Yuxi Ma Lixing Niu Jiaheng Han Yujia Peng Yixin Zhu Lifeng Fan ELM ALM 180 9 0 20 May 2024
Elements of World Knowledge (EWoK): A Cognition-Inspired Framework for Evaluating Basic World Knowledge in Language Models Transactions of the Association for Computational Linguistics (TACL), 2024 Anna A. Ivanova Aalok Sathe Benjamin Lipkin Unnathi Kumar S. Radkani ... Leshem Choshen Roger Levy Evelina Fedorenko Josh Tenenbaum Jacob Andreas 281 54 0 15 May 2024
From Perils to Possibilities: Understanding how Human (and AI) Biases affect Online Fora Virginia Morini Valentina Pansanella Katherine Abramski Erica Cau Andrea Failla Salvatore Citraro Giulio Rossetti 172 1 0 21 Mar 2024
GOMA: Proactive Embodied Cooperative Communication via Goal-Oriented Mental Alignment Lance Ying Kunal Jha Shivam Aarya Joshua B. Tenenbaum Antonio Torralba Tianmin Shu 234 19 0 17 Mar 2024
Views Are My Own, but Also Yours: Benchmarking Theory of Mind Using Common Ground Adil Soubki John Murzaku Arash Yousefi Jordehi Peter Zeng Magdalena Markowska Seyed Abolghasem Mirroshandel Owen Rambow VLM 216 10 0 04 Mar 2024
OpenToM: A Comprehensive Benchmark for Evaluating Theory-of-Mind Reasoning Capabilities of Large Language Models Hainiu Xu Runcong Zhao Lixing Zhu Bin Liang Yulan He 409 37 0 08 Feb 2024
Empathy and the Right to Be an Exception: What LLMs Can and Cannot Do William Kidder Jason D’Cruz Kush R. Varshney 180 5 0 25 Jan 2024
MMToM-QA: Multimodal Theory of Mind Question Answering Annual Meeting of the Association for Computational Linguistics (ACL), 2024 Chuanyang Jin Yutong Wu Jing Cao Jiannan Xiang Yen-Ling Kuo Zhiting Hu T. Ullman Antonio Torralba Joshua B. Tenenbaum Tianmin Shu 268 68 0 16 Jan 2024
Language Models, Agent Models, and World Models: The LAW for Machine Reasoning and Planning Zhiting Hu Tianmin Shu LLMAG LM&Ro LRM 302 47 0 08 Dec 2023
Generative agent-based modeling with actions grounded in physical, social, or digital space using Concordia A. Vezhnevets J. Agapiou Avia Aharon Ron Ziv Jayd Matyas Edgar A. Duénez-Guzmán William A. Cunningham Simon Osindero Danny Karmon Joel Z Leibo LLMAG LM&Ro AI4CE 305 83 0 06 Dec 2023
Think Twice: Perspective-Taking Improves Large Language Models' Theory-of-Mind Capabilities Alex Wilf Sihyun Shawn Lee Paul Pu Liang Louis-Philippe Morency LRM 281 74 0 16 Nov 2023
Deep Natural Language Feature Learning for Interpretable Prediction Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 Felipe Urrutia Cristian Buc Valentin Barriere 243 3 0 09 Nov 2023
A Graph-to-Text Approach to Knowledge-Grounded Response Generation in Human-Robot Interaction Nicholas Walker Stefan Ultes Pierre Lison LM&Ro 464 1 0 03 Nov 2023
Theory of Mind in Large Language Models: Examining Performance of 11 State-of-the-Art models vs. Children Aged 7-10 on Advanced Tests Conference on Computational Natural Language Learning (CoNLL), 2023 Max J. van Duijn Bram van Dijk Tom Kouwenhoven Werner de Valk M. Spruit P. V. D. Putten ELM LRM 292 48 0 31 Oct 2023
Large Language Models: The Need for Nuance in Current Debates and a Pragmatic Perspective on Understanding Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 Bram van Dijk Tom Kouwenhoven M. Spruit Max J. van Duijn 292 25 0 30 Oct 2023
Towards A Holistic Landscape of Situated Theory of Mind in Large Language Models Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 Ziqiao Ma Jacob Sansom Run Peng Joyce Chai 266 30 0 30 Oct 2023
FANToM: A Benchmark for Stress-testing Machine Theory of Mind in Interactions Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 Hyunwoo J. Kim Melanie Sclar Xuhui Zhou Ronan Le Bras Gunhee Kim Yejin Choi Maarten Sap LLMAG 228 123 0 24 Oct 2023
Large Language Models are biased to overestimate profoundness Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 Eugenio Herrera-Berg Tomás Vergara Browne Pablo León-Villagrá Marc-Lluís Vives Cristian Buc Calderon ELM 94 9 0 22 Oct 2023
Theory of Mind for Multi-Agent Collaboration via Large Language Models Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 Huao Li Yu Quan Chong Simon Stepputtis Joseph Campbell Dana Hughes Michael Lewis Katia Sycara LLMAG 400 122 0 16 Oct 2023
How FaR Are Large Language Models From Agents with Theory-of-Mind? Pei Zhou Aman Madaan Srividya Pranavi Potharaju Aditya Gupta Kevin R. McKee ... Xiang Ren Swaroop Mishra Aida Nematzadeh Shyam Upadhyay Manaal Faruqui LRM AI4CE 177 66 0 04 Oct 2023
From DDMs to DNNs: Using process data and models of decision-making to improve human-AI interactions Mrugsen Nagsen Gopnarayan Jaan Aru S. Gluth AI4CE 185 2 0 29 Aug 2023
Is GPT a Computational Model of Emotion? Detailed Analysis Ala Nekouvaght Tak Jonathan Gratch LLMAG 112 13 0 25 Jul 2023
Personality Traits in Large Language Models Gregory Serapio-García Mustafa Safdari Clément Crepy Luning Sun Stephen Fitz P. Romero Marwa Abdulhai Aleksandra Faust Maja J. Matarić LM&MA LLMAG 669 175 0 01 Jul 2023
Understanding Social Reasoning in Language Models with Language Models Neural Information Processing Systems (NeurIPS), 2023 Kanishk Gandhi Jan-Philipp Fränken Tobias Gerstenberg Noah D. Goodman LRM 282 172 0 21 Jun 2023
Developing Effective Educational Chatbots with ChatGPT prompts: Insights from Preliminary Tests in a Case Study on Social Media Literacy (with appendix) IEEE International Conference on Consumer Electronics (ICCE), 2023 Cansu Koyuturk Mona Yavari Emily Theophilou Sathya Bursic Gregor Donabauer ... Raffaele Boiano A. Gabbiadini Davinia Hernández Leo Martin Ruskov D. Ognibene 244 22 0 18 Jun 2023
Can Language Models Teach Weaker Agents? Teacher Explanations Improve Students via Personalization Swarnadeep Saha Peter Hase Mohit Bansal LRM 174 16 0 15 Jun 2023
Inductive reasoning in humans and large language models Cognitive Systems Research (Cogn. Syst. Res.), 2023 Simon J. Han Keith Ransom Andrew Perfors Charles Kemp ELM ReLM LRM 189 49 0 11 Jun 2023
Minding Language Models' (Lack of) Theory of Mind: A Plug-and-Play Multi-Character Belief Tracker Annual Meeting of the Association for Computational Linguistics (ACL), 2023 Melanie Sclar Sachin Kumar Peter West Alane Suhr Yejin Choi Yulia Tsvetkov 267 106 0 01 Jun 2023
ReviewerGPT? An Exploratory Study on Using Large Language Models for Paper Reviewing Ryan Liu Nihar B. Shah ELM 188 105 0 01 Jun 2023
Playing repeated games with Large Language Models Nature Human Behaviour (Nat Hum Behav), 2023 Elif Akata Lion Schulz Julian Coda-Forno Seong Joon Oh Matthias Bethge Eric Schulz 1.1K 189 0 26 May 2023
Comparing Machines and Children: Using Developmental Psychology Experiments to Assess the Strengths and Weaknesses of LaMDA Responses Eliza Kosoy Emily Rose Reagan Leslie Y. Lai Alison Gopnik Danielle Krettek Cobb 212 10 0 18 May 2023
Boosting Theory-of-Mind Performance in Large Language Models via Prompting Shima Rahimi Moghaddam C. Honey LLMAG LRM AI4CE 289 92 0 22 Apr 2023
Dialectical language model evaluation: An initial appraisal of the commonsense spatial reasoning abilities of LLMs Anthony G. Cohn Jose Hernandez-Orallo ELM ReLM LRM 120 32 0 22 Apr 2023
Eight Things to Know about Large Language Models Sam Bowman ALM 291 137 0 02 Apr 2023
Reflexion: Language Agents with Verbal Reinforcement Learning Neural Information Processing Systems (NeurIPS), 2023 Noah Shinn Federico Cassano Beck Labash A. Gopinath Karthik Narasimhan Shunyu Yao LLMAG KELM 585 2,157 0 20 Mar 2023
Evaluating Large Language Models in Theory of Mind Tasks Proceedings of the National Academy of Sciences of the United States of America (PNAS), 2023 Michal Kosinski LLMAG LRM 561 238 0 04 Feb 2023
Define, Evaluate, and Improve Task-Oriented Cognitive Capabilities for Instruction Generation Models Annual Meeting of the Association for Computational Linguistics (ACL), 2022 Lingjun Zhao Khanh Nguyen Hal Daumé ELM 229 7 0 21 Dec 2022
A fine-grained comparison of pragmatic language understanding in humans and language models Annual Meeting of the Association for Computational Linguistics (ACL), 2022 Jennifer Hu Sammy Floyd Olessia Jouravlev Evelina Fedorenko E. Gibson 211 82 0 13 Dec 2022
Event knowledge in large language models: the gap between the impossible and the unlikely Cognitive Sciences (CS), 2022 Carina Kauf Anna A. Ivanova Giulia Rambelli Emmanuele Chersoni Jingyuan Selena She Zawad Chowdhury Evelina Fedorenko Alessandro Lenci 461 86 0 02 Dec 2022
Neural Theory-of-Mind? On the Limits of Social Intelligence in Large LMs Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022 Maarten Sap Ronan Le Bras Daniel Fried Yejin Choi 337 266 0 24 Oct 2022
Do Large Language Models know what humans know? Cognitive Sciences (CS), 2022 Sean Trott Cameron J. Jones Tyler A. Chang J. Michaelov Benjamin Bergen 278 115 0 04 Sep 2022
Using Large Language Models to Simulate Multiple Humans and Replicate Human Subject Studies International Conference on Machine Learning (ICML), 2022 Gati Aher Rosa I. Arriaga Adam Tauman Kalai 591 533 0 18 Aug 2022