2406.14737
Dissecting the Ullman Variations with a SCALPEL: Why do LLMs fail at Trivial Alterations to the False Belief Task?
20 June 2024
Zhiqiang Pi
Annapurna Vadaparty
Benjamin Bergen
Cameron R. Jones
Papers citing "Dissecting the Ullman Variations with a SCALPEL: Why do LLMs fail at Trivial Alterations to the False Belief Task?"
16 / 16 papers shown
Title
Can Vision Language Models Infer Human Gaze Direction? A Controlled Study
Zory Zhang
Pinyuan Feng
Bingyang Wang
Tianwei Zhao
Suyang Yu
Qingying Gao
Hokin Deng
Ziqiao Ma
Yijiang Li
Dezhi Luo
25
0
0
04 Jun 2025
Re-evaluating Theory of Mind evaluation in large language models
Jennifer Hu
Felix Sosa
T. Ullman
154
2
0
28 Feb 2025
Mind Your Theory: Theory of Mind Goes Deeper Than Reasoning
Eitan Wagner
Nitay Alon
J. Barnby
Omri Abend
LRM
205
2
0
18 Dec 2024
Auxiliary task demands mask the capabilities of smaller language models
Jennifer Hu
Michael C. Frank
ELM
94
32
0
03 Apr 2024
FANToM: A Benchmark for Stress-testing Machine Theory of Mind in Interactions
Hyunwoo J. Kim
Melanie Sclar
Xuhui Zhou
Ronan Le Bras
Gunhee Kim
Yejin Choi
Maarten Sap
LLMAG
86
92
0
24 Oct 2023
Understanding Social Reasoning in Language Models with Language Models
Kanishk Gandhi
Jan-Philipp Fränken
Tobias Gerstenberg
Noah D. Goodman
LRM
76
126
0
21 Jun 2023
Turning large language models into cognitive models
Marcel Binz
Eric Schulz
100
63
0
06 Jun 2023
Clever Hans or Neural Theory of Mind? Stress Testing Social Reasoning in Large Language Models
Natalie Shapira
Mosh Levy
S. Alavi
Xuhui Zhou
Yejin Choi
Yoav Goldberg
Maarten Sap
Vered Shwartz
LLMAG
ELM
116
128
0
24 May 2023
Machine Psychology: Investigating Emergent Capabilities and Behavior in Large Language Models Using Psychological Methods
Thilo Hagendorff
LLMAG
132
72
0
24 Mar 2023
Large Language Models Fail on Trivial Alterations to Theory-of-Mind Tasks
T. Ullman
LRM
82
241
0
16 Feb 2023
Evaluating Large Language Models in Theory of Mind Tasks
Michal Kosinski
LLMAG
LRM
102
141
0
04 Feb 2023
The Debate Over Understanding in AI's Large Language Models
Melanie Mitchell
D. Krakauer
ELM
155
222
0
14 Oct 2022
Do Large Language Models know what humans know?
Sean Trott
Cameron R. Jones
Tyler A. Chang
J. Michaelov
Benjamin Bergen
93
97
0
04 Sep 2022
Right for the Wrong Reasons: Diagnosing Syntactic Heuristics in Natural Language Inference
R. Thomas McCoy
Ellie Pavlick
Tal Linzen
159
1,244
0
04 Feb 2019
Targeted Syntactic Evaluation of Language Models
Rebecca Marvin
Tal Linzen
99
417
0
27 Aug 2018
Stress Test Evaluation for Natural Language Inference
Aakanksha Naik
Abhilasha Ravichander
Norman M. Sadeh
Carolyn Rose
Graham Neubig
ELM
121
380
0
02 Jun 2018