2410.10479
Cited By
TMGBench: A Systematic Game Benchmark for Evaluating Strategic Reasoning Abilities of LLMs
14 October 2024
Jian Shu
Xiachong Feng
Lei Li
Zhan Qin
Dianbo Sui
Lingpeng Kong
LRM
ELM
Papers citing
"TMGBench: A Systematic Game Benchmark for Evaluating Strategic Reasoning Abilities of LLMs"
44 / 44 papers shown
Title
Communication Enables Cooperation in LLM Agents: A Comparison with Curriculum-Based Approaches
Hachem Madmoun
Salem Lahlou
52
0
0
07 Oct 2025
CHBench: A Cognitive Hierarchy Benchmark for Evaluating Strategic Reasoning Capability of LLMs
Hongtao Liu
Zhicheng Du
Zihe Wang
Weiran Shen
LRM
84
0
0
16 Aug 2025
The Effect of State Representation on LLM Agent Behavior in Dynamic Routing Games
Lyle Goodyear
Rachel Guo
Ramesh Johari
203
3
0
18 Jun 2025
Agents Require Metacognitive and Strategic Reasoning to Succeed in the Coming Labor Markets
Simpson Zhang
Tennison Liu
M. Schaar
LLMAG
181
0
0
26 May 2025
PLANET: A Collection of Benchmarks for Evaluating LLMs' Planning Capabilities
Haoming Li
Zhaoliang Chen
Jonathan Zhang
Fei Liu
LLMAG
317
4
0
21 Apr 2025
Gemma 3 Technical Report
Gemma Team
Aishwarya B Kamath
Johan Ferret
Shreya Pathak
Nino Vieillard
...
Harshal Tushar Lehri
Hussein Hazimeh
Ian Ballantyne
Idan Szpektor
Ivan Nardini
VLM
397
677
0
25 Mar 2025
Towards properly implementing Theory of Mind in AI systems: An account of four misconceptions
Ramira van der Meulen
Rineke Verbrugge
Max van Duijn
142
0
0
28 Feb 2025
Game Theory Meets Large Language Models: A Systematic Survey with Taxonomy and New Frontiers
Haoran Sun
Yusen Wu
Yukun Cheng
Wei Chen
X. Deng
Xu Chu
OffRL
LM&MA
AI4CE
442
5
0
13 Feb 2025
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
DeepSeek-AI
Daya Guo
Dejian Yang
Haowei Zhang
Junxiao Song
...
Shiyu Wang
S. Yu
Shunfeng Zhou
Shuting Pan
S.S. Li
ReLM
VLM
OffRL
AI4TS
LRM
990
5,096
0
22 Jan 2025
A Survey on Large Language Model-Based Social Agents in Game-Theoretic Scenarios
Xiachong Feng
Longxu Dou
Ella Li
Qinghao Wang
Jian Shu
Yu Guo
Chang Ma
Lingpeng Kong
AI4CE
LM&Ro
LM&MA
ELM
LLMAG
361
14
0
05 Dec 2024
Game-theoretic LLM: Agent Workflow for Negotiation Games
Qingfeng Lan
Ollie Liu
Jinkui Chi
Alfonso Amayuelas
Julie Chen
...
Lizhou Fan
Fei Sun
William Yang Wang
Xinze Wang
Zelong Li
244
43
0
08 Nov 2024
GLEE: A Unified Framework and Benchmark for Language-based Economic Environments
Eilam Shapira
Omer Madmon
Itamar Reinman
S. Amouyal
Roi Reichart
Moshe Tennenholtz
298
13
0
07 Oct 2024
Evaluating and Enhancing LLMs Agent based on Theory of Mind in Guandan: A Multi-Player Cooperative Game under Imperfect Information
Yauwai Yim
Chunkit Chan
Tianyu Shi
Zheye Deng
Wei Fan
Tianshi Zheng
Yangqiu Song
LLMAG
243
19
0
05 Aug 2024
Knowledge Mechanisms in Large Language Models: A Survey and Perspective
Meng Wang
Yunzhi Yao
Ziwen Xu
Shuofei Qiao
Shumin Deng
...
Yong Jiang
Pengjun Xie
Fei Huang
Huajun Chen
Ningyu Zhang
285
58
0
22 Jul 2024
Are Large Language Models Strategic Decision Makers? A Study of Performance and Bias in Two-Player Non-Zero-Sum Games
Nathan Herr
Fernando Acero
Roberta Raileanu
María Pérez-Ortiz
Zhibin Li
LRM
257
4
0
05 Jul 2024
InterIntent: Investigating Social Intelligence of LLMs via Intention Understanding in an Interactive Game Context
Ziyi Liu
Abhishek Anand
Pei Zhou
Jen-tse Huang
Jieyu Zhao
252
21
0
18 Jun 2024
LLM as a Mastermind: A Survey of Strategic Reasoning with Large Language Models
Yadong Zhang
Shaoguang Mao
Tao Ge
Xun Wang
Adrian de Wynter
Yan Xia
Wenshan Wu
Ting Song
Man Lan
Furu Wei
LRM
320
99
0
01 Apr 2024
How Far Are We on the Decision-Making of LLMs? Evaluating LLMs' Gaming Ability in Multi-Agent Environments
Shu Yang
E. Li
Man Ho Lam
Tian Liang
Wenxuan Wang
Youliang Yuan
Wenxiang Jiao
Xing Wang
Zhaopeng Tu
Michael R. Lyu
ELM
LLMAG
422
52
0
18 Mar 2024
GTBench: Uncovering the Strategic Reasoning Limitations of LLMs via Game-Theoretic Evaluations
Jinhao Duan
Renming Zhang
James Diffenderfer
B. Kailkhura
Lichao Sun
Elias Stengel-Eskin
Mohit Bansal
Tianlong Chen
Kaidi Xu
ELM
LRM
230
87
0
19 Feb 2024
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
Zhihong Shao
Peiyi Wang
Qihao Zhu
Runxin Xu
Jun-Mei Song
...
Haowei Zhang
Mingchuan Zhang
Yiming Li
Yu-Huan Wu
Daya Guo
ReLM
LRM
1.0K
3,433
0
05 Feb 2024
K-Level Reasoning with Large Language Models
Yadong Zhang
Shaoguang Mao
Tao Ge
Xun Wang
Yan Xia
Man Lan
Furu Wei
LRM
ReLM
161
8
0
02 Feb 2024
Can Large Language Models Serve as Rational Players in Game Theory? A Systematic Analysis
AAAI Conference on Artificial Intelligence (AAAI), 2023
Caoyun Fan
Jindou Chen
Yaohui Jin
Hao He
176
93
0
09 Dec 2023
MAgIC: Investigation of Large Language Model Powered Multi-Agent in Cognition, Adaptability, Rationality and Collaboration
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Lin Xu
Zhiyuan Hu
Daquan Zhou
Hongyu Ren
Zhen Dong
Kurt Keutzer
See Kiong Ng
Jiashi Feng
LRM
LLMAG
ELM
155
51
0
14 Nov 2023
Suspicion-Agent: Playing Imperfect Information Games with Theory of Mind Aware GPT-4
Jiaxian Guo
Bo Yang
Paul D. Yoo
Bill Yuchen Lin
Yusuke Iwasawa
Yutaka Matsuo
LLMAG
265
59
0
29 Sep 2023
Strategic Behavior of Large Language Models: Game Structure vs. Contextual Framing
Social Science Research Network (SSRN), 2023
Nunzio Lorè
Babak Heydari
126
49
0
12 Sep 2023
Beyond Static Datasets: A Deep Interaction Approach to LLM Evaluation
Jiatong Li
Rui Li
Qi Liu
154
27
0
08 Sep 2023
Boosting Logical Reasoning in Large Language Models through a New Framework: The Graph of Thought
Bin Lei
Pei-Hung Lin
C. Liao
Caiwen Ding
ReLM
ELM
LRM
AI4CE
146
46
0
16 Aug 2023
Exploring the MIT Mathematics and EECS Curriculum Using Large Language Models
Sarah J. Zhang
Samuel H. Florin
Ariel N. Lee
Eamon Niknafs
Andrei Marginean
...
Madeleine Udell
Yoon Kim
Tonio Buonassisi
Armando Solar-Lezama
Iddo Drori
ELM
147
20
0
15 Jun 2023
Strategic Reasoning with Language Models
Kanishk Gandhi
Dorsa Sadigh
Noah D. Goodman
LM&Ro
LRM
134
53
0
30 May 2023
Playing repeated games with Large Language Models
Nature Human Behaviour (Nat Hum Behav), 2023
Elif Akata
Lion Schulz
Julian Coda-Forno
Seong Joon Oh
Matthias Bethge
Eric Schulz
1.0K
183
0
26 May 2023
Large Language Models as Commonsense Knowledge for Large-Scale Task Planning
Neural Information Processing Systems (NeurIPS), 2023
Zirui Zhao
W. Lee
David Hsu
LRM
LLMAG
LM&Ro
330
311
0
23 May 2023
The Machine Psychology of Cooperation: Can GPT models operationalise prompts for altruism, cooperation, competitiveness and selfishness in economic games?
S. Phelps
Y. Russell
236
24
0
13 May 2023
Causal Reasoning and Large Language Models: Opening a New Frontier for Causality
Emre Kıcıman
Robert Osazuwa Ness
Amit Sharma
Chenhao Tan
LRM
ELM
428
371
0
28 Apr 2023
ART: Automatic multi-step reasoning and tool-use for large language models
Bhargavi Paranjape
Scott M. Lundberg
Sameer Singh
Hannaneh Hajishirzi
Luke Zettlemoyer
Marco Tulio Ribeiro
KELM
ReLM
LRM
259
187
0
16 Mar 2023
MathPrompter: Mathematical Reasoning using Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Shima Imani
Liang Du
H. Shrivastava
KELM
ReLM
LRM
198
259
0
04 Mar 2023
Large Language Models Fail on Trivial Alterations to Theory-of-Mind Tasks
T. Ullman
LRM
281
291
0
16 Feb 2023
Towards Reasoning in Large Language Models: A Survey
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Jie Huang
Kevin Chen-Chuan Chang
LM&MA
ELM
LRM
756
787
0
20 Dec 2022
Using Large Language Models to Simulate Multiple Humans and Replicate Human Subject Studies
International Conference on Machine Learning (ICML), 2022
Gati Aher
Rosa I. Arriaga
Adam Tauman Kalai
479
530
0
18 Aug 2022
Inner Monologue: Embodied Reasoning through Planning with Language Models
Conference on Robot Learning (CoRL), 2022
Wenlong Huang
F. Xia
Ted Xiao
Harris Chan
Jacky Liang
...
Tomas Jackson
Linda Luu
Sergey Levine
Karol Hausman
Brian Ichter
LLMAG
LM&Ro
LRM
316
1,132
0
12 Jul 2022
Solving Quantitative Reasoning Problems with Language Models
Neural Information Processing Systems (NeurIPS), 2022
Aitor Lewkowycz
Anders Andreassen
David Dohan
Ethan Dyer
Henryk Michalewski
...
Theo Gutman-Solo
Yuhuai Wu
Behnam Neyshabur
Guy Gur-Ari
Vedant Misra
ReLM
ELM
LRM
584
1,249
0
29 Jun 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Neural Information Processing Systems (NeurIPS), 2022
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
2.1K
13,906
0
28 Jan 2022
Recent Advances in Natural Language Processing via Large Pre-Trained Language Models: A Survey
ACM Computing Surveys (CSUR), 2021
Bonan Min
Hayley L Ross
Elior Sulem
Amir Pouran Ben Veyseh
Thien Huu Nguyen
Oscar Sainz
Eneko Agirre
Ilana Heinz
Dan Roth
LM&MA
VLM
AI4CE
354
1,322
0
01 Nov 2021
Measuring Mathematical Problem Solving With the MATH Dataset
Dan Hendrycks
Collin Burns
Saurav Kadavath
Akul Arora
Steven Basart
Eric Tang
Basel Alomair
Jacob Steinhardt
ReLM
FaML
699
3,710
0
05 Mar 2021
PIQA: Reasoning about Physical Commonsense in Natural Language
AAAI Conference on Artificial Intelligence (AAAI), 2019
Yonatan Bisk
Rowan Zellers
Ronan Le Bras
Jianfeng Gao
Yejin Choi
OOD
LRM
1.1K
2,415
0
26 Nov 2019