2406.17513
Benchmarking Mental State Representations in Language Models
25 June 2024
Matteo Bortoletto
Constantin Ruhdorfer
Lei Shi
Andreas Bulling
AI4MH
LRM
Papers citing
"Benchmarking Mental State Representations in Language Models"
17 / 17 papers shown
Title
A Survey of Theory of Mind in Large Language Models: Evaluations, Representations, and Safety Risks
Hieu Minh "Jord" Nguyen
LM&MA
LRM
49
0
0
10 Feb 2025
Mind Your Theory: Theory of Mind Goes Deeper Than Reasoning
Eitan Wagner
Nitay Alon
J. Barnby
Omri Abend
LRM
79
2
0
18 Dec 2024
Explicit Modelling of Theory of Mind for Belief Prediction in Nonverbal Social Interactions
Matteo Bortoletto
Constantin Ruhdorfer
Lei Shi
Andreas Bulling
22
1
0
09 Jul 2024
OpenToM: A Comprehensive Benchmark for Evaluating Theory-of-Mind Reasoning Capabilities of Large Language Models
Hainiu Xu
Runcong Zhao
Lixing Zhu
Jinhua Du
Yulan He
66
18
0
08 Feb 2024
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
DeepSeek-AI
Xiao Bi
Deli Chen
Guanting Chen
...
Yao Zhao
Shangyan Zhou
Shunfeng Zhou
Qihao Zhu
Yuheng Zou
LRM
ALM
131
298
0
05 Jan 2024
Neural Reasoning About Agents' Goals, Preferences, and Actions
Matteo Bortoletto
Lei Shi
Andreas Bulling
26
5
0
12 Dec 2023
Cognitive Dissonance: Why Do Language Model Outputs Disagree with Internal Representations of Truthfulness?
Kevin Liu
Stephen Casper
Dylan Hadfield-Menell
Jacob Andreas
HILM
44
35
0
27 Nov 2023
The Geometry of Truth: Emergent Linear Structure in Large Language Model Representations of True/False Datasets
Samuel Marks
Max Tegmark
HILM
91
164
0
10 Oct 2023
Finding Neurons in a Haystack: Case Studies with Sparse Probing
Wes Gurnee
Neel Nanda
Matthew Pauly
Katherine Harvey
Dmitrii Troitskii
Dimitris Bertsimas
MILM
153
170
0
02 May 2023
Sparks of Artificial General Intelligence: Early experiments with GPT-4
Sébastien Bubeck
Varun Chandrasekaran
Ronen Eldan
J. Gehrke
Eric Horvitz
...
Scott M. Lundberg
Harsha Nori
Hamid Palangi
Marco Tulio Ribeiro
Yi Zhang
ELM
AI4MH
AI4CE
ALM
197
2,953
0
22 Mar 2023
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
MindCraft: Theory of Mind Modeling for Situated Dialogue in Collaborative Tasks
Cristian-Paul Bara
Sky CH-Wang
J. Chai
62
61
0
13 Sep 2021
Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity
Yao Lu
Max Bartolo
Alastair Moore
Sebastian Riedel
Pontus Stenetorp
AILaw
LRM
274
882
0
18 Apr 2021
Probing Classifiers: Promises, Shortcomings, and Advances
Yonatan Belinkov
219
291
0
24 Feb 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
236
1,508
0
31 Dec 2020
Language Models as Knowledge Bases?
Fabio Petroni
Tim Rocktäschel
Patrick Lewis
A. Bakhtin
Yuxiang Wu
Alexander H. Miller
Sebastian Riedel
KELM
AI4MH
391
2,216
0
03 Sep 2019
What you can cram into a single vector: Probing sentence embeddings for linguistic properties
Alexis Conneau
Germán Kruszewski
Guillaume Lample
Loïc Barrault
Marco Baroni
196
876
0
03 May 2018