Large Language Models Fail on Trivial Alterations to Theory-of-Mind Tasks

16 February 2023
T. Ullman
    LRM

Papers citing "Large Language Models Fail on Trivial Alterations to Theory-of-Mind Tasks"

50 / 100 papers shown
MuMA-ToM: Multi-modal Multi-Agent Theory of Mind
AAAI Conference on Artificial Intelligence (AAAI), 2024
Haojun Shi
Suyu Ye
Xinyu Fang
Chuanyang Jin
Leyla Isik
Yen-Ling Kuo
Tianmin Shu
LLMAG
393
32
0
22 Aug 2024
Large Language Model Recall Uncertainty is Modulated by the Fan Effect
Jesse Roberts
Kyle Moore
Thao Pham
Oseremhen Ewaleifoh
Doug Fisher
270
6
0
08 Jul 2024
TimeToM: Temporal Space is the Key to Unlocking the Door of Large Language Models' Theory-of-Mind
Guiyang Hou
Wenqi Zhang
Yongliang Shen
Linjuan Wu
Weiming Lu
LRM AI4CE
176
19
0
01 Jul 2024
Large Language Models Assume People are More Rational than We Really are
Ryan Liu
Jiayi Geng
Joshua C. Peterson
Ilia Sucholutsky
Thomas Griffiths
480
35
0
24 Jun 2024
Dissecting the Ullman Variations with a SCALPEL: Why do LLMs fail at Trivial Alterations to the False Belief Task?
Zhiqiang Pi
Annapurna Vadaparty
Benjamin Bergen
Cameron R. Jones
274
4
0
20 Jun 2024
What is the Visual Cognition Gap between Humans and Multimodal LLMs?
Xu Cao
Yifan Shen
Bolin Lai
Wenqian Ye
Yunsheng Ma
...
Jintai Chen
Meihuan Huang
Jianguo Cao
Aidong Zhang
James M. Rehg
323
20
0
14 Jun 2024
GPT-ology, Computational Models, Silicon Sampling: How should we think about LLMs in Cognitive Science?
Desmond C. Ong
283
5
0
13 Jun 2024
A social path to human-like artificial intelligence
Edgar A. Duéñez-Guzmán
Suzanne Sadedin
Jane X. Wang
Kevin R. McKee
Joel Z Leibo
GNN
295
36
0
22 May 2024
Evaluating and Modeling Social Intelligence: A Comparative Study of Human and AI Capabilities
Junqi Wang
Chunhui Zhang
Jiapeng Li
Yuxi Ma
Lixing Niu
Jiaheng Han
Yujia Peng
Yixin Zhu
Lifeng Fan
ELM ALM
180
9
0
20 May 2024
Elements of World Knowledge (EWoK): A Cognition-Inspired Framework for Evaluating Basic World Knowledge in Language Models
Transactions of the Association for Computational Linguistics (TACL), 2024
Anna A. Ivanova
Aalok Sathe
Benjamin Lipkin
Unnathi Kumar
S. Radkani
...
Leshem Choshen
Roger Levy
Evelina Fedorenko
Josh Tenenbaum
Jacob Andreas
281
54
0
15 May 2024
From Perils to Possibilities: Understanding how Human (and AI) Biases affect Online Fora
Virginia Morini
Valentina Pansanella
Katherine Abramski
Erica Cau
Andrea Failla
Salvatore Citraro
Giulio Rossetti
172
1
0
21 Mar 2024
GOMA: Proactive Embodied Cooperative Communication via Goal-Oriented Mental Alignment
Lance Ying
Kunal Jha
Shivam Aarya
Joshua B. Tenenbaum
Antonio Torralba
Tianmin Shu
234
19
0
17 Mar 2024
Views Are My Own, but Also Yours: Benchmarking Theory of Mind Using Common Ground
Adil Soubki
John Murzaku
Arash Yousefi Jordehi
Peter Zeng
Magdalena Markowska
Seyed Abolghasem Mirroshandel
Owen Rambow
VLM
216
10
0
04 Mar 2024
OpenToM: A Comprehensive Benchmark for Evaluating Theory-of-Mind Reasoning Capabilities of Large Language Models
Hainiu Xu
Runcong Zhao
Lixing Zhu
Bin Liang
Yulan He
409
37
0
08 Feb 2024
Empathy and the Right to Be an Exception: What LLMs Can and Cannot Do
William Kidder
Jason D’Cruz
Kush R. Varshney
180
5
0
25 Jan 2024
MMToM-QA: Multimodal Theory of Mind Question Answering
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Chuanyang Jin
Yutong Wu
Jing Cao
Jiannan Xiang
Yen-Ling Kuo
Zhiting Hu
T. Ullman
Antonio Torralba
Joshua B. Tenenbaum
Tianmin Shu
268
68
0
16 Jan 2024
Language Models, Agent Models, and World Models: The LAW for Machine Reasoning and Planning
Zhiting Hu
Tianmin Shu
LLMAG LM&Ro LRM
302
47
0
08 Dec 2023
Generative agent-based modeling with actions grounded in physical, social, or digital space using Concordia
A. Vezhnevets
J. Agapiou
Avia Aharon
Ron Ziv
Jayd Matyas
Edgar A. Duéñez-Guzmán
William A. Cunningham
Simon Osindero
Danny Karmon
Joel Z Leibo
LLMAG LM&Ro AI4CE
305
83
0
06 Dec 2023
Think Twice: Perspective-Taking Improves Large Language Models' Theory-of-Mind Capabilities
Alex Wilf
Sihyun Shawn Lee
Paul Pu Liang
Louis-Philippe Morency
LRM
281
74
0
16 Nov 2023
Deep Natural Language Feature Learning for Interpretable Prediction
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Felipe Urrutia
Cristian Buc
Valentin Barriere
243
3
0
09 Nov 2023
A Graph-to-Text Approach to Knowledge-Grounded Response Generation in Human-Robot Interaction
Nicholas Walker
Stefan Ultes
Pierre Lison
LM&Ro
464
1
0
03 Nov 2023
Theory of Mind in Large Language Models: Examining Performance of 11 State-of-the-Art models vs. Children Aged 7-10 on Advanced Tests
Conference on Computational Natural Language Learning (CoNLL), 2023
Max J. van Duijn
Bram van Dijk
Tom Kouwenhoven
Werner de Valk
M. Spruit
P. V. D. Putten
ELM LRM
292
48
0
31 Oct 2023
Large Language Models: The Need for Nuance in Current Debates and a Pragmatic Perspective on Understanding
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Bram van Dijk
Tom Kouwenhoven
M. Spruit
Max J. van Duijn
292
25
0
30 Oct 2023
Towards A Holistic Landscape of Situated Theory of Mind in Large Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Ziqiao Ma
Jacob Sansom
Run Peng
Joyce Chai
266
30
0
30 Oct 2023
FANToM: A Benchmark for Stress-testing Machine Theory of Mind in Interactions
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Hyunwoo J. Kim
Melanie Sclar
Xuhui Zhou
Ronan Le Bras
Gunhee Kim
Yejin Choi
Maarten Sap
LLMAG
228
123
0
24 Oct 2023
Large Language Models are biased to overestimate profoundness
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Eugenio Herrera-Berg
Tomás Vergara Browne
Pablo León-Villagrá
Marc-Lluís Vives
Cristian Buc Calderon
ELM
94
9
0
22 Oct 2023
Theory of Mind for Multi-Agent Collaboration via Large Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Huao Li
Yu Quan Chong
Simon Stepputtis
Joseph Campbell
Dana Hughes
Michael Lewis
Katia Sycara
LLMAG
400
122
0
16 Oct 2023
How FaR Are Large Language Models From Agents with Theory-of-Mind?
Pei Zhou
Aman Madaan
Srividya Pranavi Potharaju
Aditya Gupta
Kevin R. McKee
...
Xiang Ren
Swaroop Mishra
Aida Nematzadeh
Shyam Upadhyay
Manaal Faruqui
LRM AI4CE
177
66
0
04 Oct 2023
From DDMs to DNNs: Using process data and models of decision-making to improve human-AI interactions
Mrugsen Nagsen Gopnarayan
Jaan Aru
S. Gluth
AI4CE
185
2
0
29 Aug 2023
Is GPT a Computational Model of Emotion? Detailed Analysis
Ala Nekouvaght Tak
Jonathan Gratch
LLMAG
112
13
0
25 Jul 2023
Personality Traits in Large Language Models
Gregory Serapio-García
Mustafa Safdari
Clément Crepy
Luning Sun
Stephen Fitz
P. Romero
Marwa Abdulhai
Aleksandra Faust
Maja J. Matarić
LM&MA LLMAG
669
175
0
01 Jul 2023
Understanding Social Reasoning in Language Models with Language Models
Neural Information Processing Systems (NeurIPS), 2023
Kanishk Gandhi
Jan-Philipp Fränken
Tobias Gerstenberg
Noah D. Goodman
LRM
282
172
0
21 Jun 2023
Developing Effective Educational Chatbots with ChatGPT prompts: Insights from Preliminary Tests in a Case Study on Social Media Literacy (with appendix)
IEEE International Conference on Consumer Electronics (ICCE), 2023
Cansu Koyuturk
Mona Yavari
Emily Theophilou
Sathya Bursic
Gregor Donabauer
...
Raffaele Boiano
A. Gabbiadini
Davinia Hernández Leo
Martin Ruskov
D. Ognibene
244
22
0
18 Jun 2023
Can Language Models Teach Weaker Agents? Teacher Explanations Improve Students via Personalization
Swarnadeep Saha
Peter Hase
Mohit Bansal
LRM
174
16
0
15 Jun 2023
Inductive reasoning in humans and large language models
Cognitive Systems Research (Cogn. Syst. Res.), 2023
Simon J. Han
Keith Ransom
Andrew Perfors
Charles Kemp
ELM ReLM LRM
189
49
0
11 Jun 2023
Minding Language Models' (Lack of) Theory of Mind: A Plug-and-Play Multi-Character Belief Tracker
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Melanie Sclar
Sachin Kumar
Peter West
Alane Suhr
Yejin Choi
Yulia Tsvetkov
267
106
0
01 Jun 2023
ReviewerGPT? An Exploratory Study on Using Large Language Models for Paper Reviewing
Ryan Liu
Nihar B. Shah
ELM
188
105
0
01 Jun 2023
Playing repeated games with Large Language Models
Nature Human Behaviour (Nat Hum Behav), 2023
Elif Akata
Lion Schulz
Julian Coda-Forno
Seong Joon Oh
Matthias Bethge
Eric Schulz
1.1K
189
0
26 May 2023
Comparing Machines and Children: Using Developmental Psychology Experiments to Assess the Strengths and Weaknesses of LaMDA Responses
Eliza Kosoy
Emily Rose Reagan
Leslie Y. Lai
Alison Gopnik
Danielle Krettek Cobb
212
10
0
18 May 2023
Boosting Theory-of-Mind Performance in Large Language Models via Prompting
Shima Rahimi Moghaddam
C. Honey
LLMAG LRM AI4CE
289
92
0
22 Apr 2023
Dialectical language model evaluation: An initial appraisal of the commonsense spatial reasoning abilities of LLMs
Anthony G. Cohn
Jose Hernandez-Orallo
ELM ReLM LRM
120
32
0
22 Apr 2023
Eight Things to Know about Large Language Models
Sam Bowman
ALM
291
137
0
02 Apr 2023
Reflexion: Language Agents with Verbal Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2023
Noah Shinn
Federico Cassano
Beck Labash
A. Gopinath
Karthik Narasimhan
Shunyu Yao
LLMAG KELM
585
2,157
0
20 Mar 2023
Evaluating Large Language Models in Theory of Mind Tasks
Proceedings of the National Academy of Sciences of the United States of America (PNAS), 2023
Michal Kosinski
LLMAG LRM
561
238
0
04 Feb 2023
Define, Evaluate, and Improve Task-Oriented Cognitive Capabilities for Instruction Generation Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Lingjun Zhao
Khanh Nguyen
Hal Daumé
ELM
229
7
0
21 Dec 2022
A fine-grained comparison of pragmatic language understanding in humans and language models
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Jennifer Hu
Sammy Floyd
Olessia Jouravlev
Evelina Fedorenko
E. Gibson
211
82
0
13 Dec 2022
Event knowledge in large language models: the gap between the impossible and the unlikely
Cognitive Sciences (CS), 2022
Carina Kauf
Anna A. Ivanova
Giulia Rambelli
Emmanuele Chersoni
Jingyuan Selena She
Zawad Chowdhury
Evelina Fedorenko
Alessandro Lenci
461
86
0
02 Dec 2022
Neural Theory-of-Mind? On the Limits of Social Intelligence in Large LMs
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Maarten Sap
Ronan Le Bras
Daniel Fried
Yejin Choi
337
266
0
24 Oct 2022
Do Large Language Models know what humans know?
Cognitive Sciences (CS), 2022
Sean Trott
Cameron R. Jones
Tyler A. Chang
J. Michaelov
Benjamin Bergen
278
115
0
04 Sep 2022
Using Large Language Models to Simulate Multiple Humans and Replicate Human Subject Studies
International Conference on Machine Learning (ICML), 2022
Gati Aher
Rosa I. Arriaga
Adam Tauman Kalai
591
533
0
18 Aug 2022