ResearchTrend.AI
HI-TOM: A Benchmark for Evaluating Higher-Order Theory of Mind Reasoning in Large Language Models

25 October 2023
Yinghui He
Yufan Wu
Yilin Jia
Rada Mihalcea
Yulong Chen
Naihao Deng
    LRM
    LLMAG

Papers citing "HI-TOM: A Benchmark for Evaluating Higher-Order Theory of Mind Reasoning in Large Language Models"

21 / 21 papers shown
R^3-VQA: "Read the Room" by Video Social Reasoning
Lixing Niu
Jiapeng Li
Xingping Yu
Shu Wang
Ruining Feng
Bo Wu
Ping Wei
Y. Wang
Lifeng Fan
43
0
0
07 May 2025
Sentient Agent as a Judge: Evaluating Higher-Order Social Cognition in Large Language Models
Bang Zhang
Ruotian Ma
Qingxuan Jiang
Peisong Wang
Jiaqi Chen
...
Fanghua Ye
Jian Li
Yifan Yang
Zhaopeng Tu
Xiaolong Li
LLMAG
ELM
ALM
102
25
1
01 May 2025
AI Awareness
X. Li
Haoyuan Shi
Rongwu Xu
Wei Xu
54
0
0
25 Apr 2025
Assessing LLMs in Art Contexts: Critique Generation and Theory of Mind Evaluation
Takaya Arita
Wenxian Zheng
Reiji Suzuki
Fuminori Akiba
22
0
0
17 Apr 2025
EmoAgent: Assessing and Safeguarding Human-AI Interaction for Mental Health Safety
Jiahao Qiu
Yinghui He
Xinzhe Juan
Y. Wang
Y. Liu
Zixin Yao
Yue Wu
Xun Jiang
L. Yang
Mengdi Wang
AI4MH
68
0
0
13 Apr 2025
Sensitivity Meets Sparsity: The Impact of Extremely Sparse Parameter Patterns on Theory-of-Mind of Large Language Models
Yuheng Wu
Wentao Guo
Zirui Liu
Heng Ji
Zhaozhuo Xu
Denghui Zhang
33
0
0
05 Apr 2025
ToM-RL: Reinforcement Learning Unlocks Theory of Mind in Small LLMs
Yi-Long Lu
Chunhui Zhang
Jiajun Song
Lifeng Fan
Wei Wang
OffRL
46
0
0
02 Apr 2025
The Mind in the Machine: A Survey of Incorporating Psychological Theories in LLMs
Zizhou Liu
Ziwei Gong
Lin Ai
Zheng Hui
Run Chen
Colin Wayne Leach
Michelle R. Greene
Julia Hirschberg
LLMAG
99
0
0
28 Mar 2025
Persuasion Should be Double-Blind: A Multi-Domain Dialogue Dataset With Faithfulness Based on Causal Theory of Mind
Dingyi Zhang
Deyu Zhou
61
1
0
28 Feb 2025
Re-evaluating Theory of Mind evaluation in large language models
Jennifer Hu
Felix Sosa
T. Ullman
40
0
0
28 Feb 2025
Hypothesis-Driven Theory-of-Mind Reasoning for Large Language Models
Hyunwoo Kim
Melanie Sclar
Tan Zhi-Xuan
Lance Ying
Sydney Levine
Yang Liu
Joshua B. Tenenbaum
Yejin Choi
LRM
LLMAG
49
0
0
17 Feb 2025
Large Language Models as Theory of Mind Aware Generative Agents with Counterfactual Reflection
Bo Yang
Jiaxian Guo
Yusuke Iwasawa
Y. Matsuo
AI4CE
37
1
0
28 Jan 2025
Belief in the Machine: Investigating Epistemological Blind Spots of Language Models
Mirac Suzgun
Tayfun Gur
Federico Bianchi
Daniel E. Ho
Thomas F. Icard
Dan Jurafsky
James Zou
29
1
0
28 Oct 2024
SimpleToM: Exposing the Gap between Explicit ToM Inference and Implicit ToM Application in LLMs
Yuling Gu
Oyvind Tafjord
Hyunwoo Kim
Jared Moore
Ronan Le Bras
Peter Clark
Yejin Choi
28
8
0
17 Oct 2024
EgoSocialArena: Benchmarking the Social Intelligence of Large Language Models from a First-person Perspective
Guiyang Hou
Wenqi Zhang
Yongliang Shen
Zeqi Tan
Sihao Shen
Weiming Lu
31
0
0
08 Oct 2024
LogicGame: Benchmarking Rule-Based Reasoning Abilities of Large Language Models
Jiayi Gui
Yiming Liu
Jiale Cheng
Xiaotao Gu
Xiao-Yang Liu
Hongning Wang
Yuxiao Dong
Jie Tang
Minlie Huang
ELM
LLMAG
LRM
32
2
0
28 Aug 2024
Benchmarking Mental State Representations in Language Models
Matteo Bortoletto
Constantin Ruhdorfer
Lei Shi
Andreas Bulling
AI4MH
LRM
44
4
0
25 Jun 2024
LLMs achieve adult human performance on higher-order theory of mind tasks
Winnie Street
John Oliver Siy
Geoff Keeling
Adrien Baranes
Benjamin Barnett
Michael McKibben
Tatenda Kanyere
Alison Lentz
Blaise Agüera y Arcas
Robin I. M. Dunbar
LRM
44
32
0
29 May 2024
Cognitive Insights and Stable Coalition Matching for Fostering Multi-Agent Cooperation
Jiaqi Shao
Tianjun Yuan
Tao Lin
Xuanyu Cao
40
0
0
28 May 2024
Generative Agents: Interactive Simulacra of Human Behavior
J. Park
Joseph C. O'Brien
Carrie J. Cai
Meredith Ringel Morris
Percy Liang
Michael S. Bernstein
LM&Ro
AI4CE
215
1,733
0
07 Apr 2023
Sparks of Artificial General Intelligence: Early experiments with GPT-4
Sébastien Bubeck
Varun Chandrasekaran
Ronen Eldan
J. Gehrke
Eric Horvitz
...
Scott M. Lundberg
Harsha Nori
Hamid Palangi
Marco Tulio Ribeiro
Yi Zhang
ELM
AI4MH
AI4CE
ALM
251
2,232
0
22 Mar 2023