TMGBench: A Systematic Game Benchmark for Evaluating Strategic Reasoning Abilities of LLMs

14 October 2024
Jian Shu
Xiachong Feng
Lei Li
Zhan Qin
Dianbo Sui
Lingpeng Kong
LRM ELM

Papers citing "TMGBench: A Systematic Game Benchmark for Evaluating Strategic Reasoning Abilities of LLMs"

44 / 44 papers shown
Communication Enables Cooperation in LLM Agents: A Comparison with Curriculum-Based Approaches
Hachem Madmoun
Salem Lahlou
52
0
0
07 Oct 2025
CHBench: A Cognitive Hierarchy Benchmark for Evaluating Strategic Reasoning Capability of LLMs
Hongtao Liu
Zhicheng Du
Zihe Wang
Weiran Shen
LRM
84
0
0
16 Aug 2025
The Effect of State Representation on LLM Agent Behavior in Dynamic Routing Games
Lyle Goodyear
Rachel Guo
Ramesh Johari
203
3
0
18 Jun 2025
Agents Require Metacognitive and Strategic Reasoning to Succeed in the Coming Labor Markets
Simpson Zhang
Tennison Liu
M. Schaar
LLMAG
181
0
0
26 May 2025
PLANET: A Collection of Benchmarks for Evaluating LLMs' Planning Capabilities
Haoming Li
Zhaoliang Chen
Jonathan Zhang
Fei Liu
LLMAG
317
4
0
21 Apr 2025
Gemma 3 Technical Report
Gemma Team
Aishwarya B Kamath
Johan Ferret
Shreya Pathak
Nino Vieillard
...
Harshal Tushar Lehri
Hussein Hazimeh
Ian Ballantyne
Idan Szpektor
Ivan Nardini
VLM
397
677
0
25 Mar 2025
Towards properly implementing Theory of Mind in AI systems: An account of four misconceptions
Ramira van der Meulen
Rineke Verbrugge
Max van Duijn
142
0
0
28 Feb 2025
Game Theory Meets Large Language Models: A Systematic Survey with Taxonomy and New Frontiers
Haoran Sun
Yusen Wu
Yukun Cheng
Wei Chen
X. Deng
Xu Chu
OffRL LM&MA AI4CE
442
5
0
13 Feb 2025
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
DeepSeek-AI
Daya Guo
Dejian Yang
Haowei Zhang
Junxiao Song
...
Shiyu Wang
S. Yu
Shunfeng Zhou
Shuting Pan
S.S. Li
ReLM VLM OffRL AI4TS LRM
990
5,096
0
22 Jan 2025
A Survey on Large Language Model-Based Social Agents in Game-Theoretic Scenarios
Xiachong Feng
Longxu Dou
Ella Li
Qinghao Wang
Jian Shu
Yu Guo
Chang Ma
Lingpeng Kong
AI4CE LM&Ro LM&MA ELM LLMAG
361
14
0
05 Dec 2024
Game-theoretic LLM: Agent Workflow for Negotiation Games
Qingfeng Lan
Ollie Liu
Jinkui Chi
Alfonso Amayuelas
Julie Chen
...
Lizhou Fan
Fei Sun
William Yang Wang
Xinze Wang
Zelong Li
244
43
0
08 Nov 2024
GLEE: A Unified Framework and Benchmark for Language-based Economic Environments
Eilam Shapira
Omer Madmon
Itamar Reinman
S. Amouyal
Roi Reichart
Moshe Tennenholtz
298
13
0
07 Oct 2024
Evaluating and Enhancing LLMs Agent based on Theory of Mind in Guandan: A Multi-Player Cooperative Game under Imperfect Information
Yauwai Yim
Chunkit Chan
Tianyu Shi
Zheye Deng
Wei Fan
Tianshi Zheng
Yangqiu Song
LLMAG
243
19
0
05 Aug 2024
Knowledge Mechanisms in Large Language Models: A Survey and Perspective
Meng Wang
Yunzhi Yao
Ziwen Xu
Shuofei Qiao
Shumin Deng
...
Yong Jiang
Pengjun Xie
Fei Huang
Huajun Chen
Ningyu Zhang
285
58
0
22 Jul 2024
Are Large Language Models Strategic Decision Makers? A Study of Performance and Bias in Two-Player Non-Zero-Sum Games
Nathan Herr
Fernando Acero
Roberta Raileanu
María Pérez-Ortiz
Zhibin Li
LRM
257
4
0
05 Jul 2024
InterIntent: Investigating Social Intelligence of LLMs via Intention Understanding in an Interactive Game Context
Ziyi Liu
Abhishek Anand
Pei Zhou
Jen-tse Huang
Jieyu Zhao
252
21
0
18 Jun 2024
LLM as a Mastermind: A Survey of Strategic Reasoning with Large Language Models
Yadong Zhang
Shaoguang Mao
Tao Ge
Xun Wang
Adrian de Wynter
Yan Xia
Wenshan Wu
Ting Song
Man Lan
Furu Wei
LRM
320
99
0
01 Apr 2024
How Far Are We on the Decision-Making of LLMs? Evaluating LLMs' Gaming Ability in Multi-Agent Environments
Shu Yang
E. Li
Man Ho Lam
Tian Liang
Wenxuan Wang
Youliang Yuan
Wenxiang Jiao
Xing Wang
Zhaopeng Tu
Michael R. Lyu
ELM LLMAG
422
52
0
18 Mar 2024
GTBench: Uncovering the Strategic Reasoning Limitations of LLMs via Game-Theoretic Evaluations
Jinhao Duan
Renming Zhang
James Diffenderfer
B. Kailkhura
Lichao Sun
Elias Stengel-Eskin
Mohit Bansal
Tianlong Chen
Kaidi Xu
ELM LRM
230
87
0
19 Feb 2024
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
Zhihong Shao
Peiyi Wang
Qihao Zhu
Runxin Xu
Jun-Mei Song
...
Haowei Zhang
Mingchuan Zhang
Yiming Li
Yu-Huan Wu
Daya Guo
ReLM LRM
1.0K
3,433
0
05 Feb 2024
K-Level Reasoning with Large Language Models
Yadong Zhang
Shaoguang Mao
Tao Ge
Xun Wang
Yan Xia
Man Lan
Furu Wei
LRM ReLM
161
8
0
02 Feb 2024
Can Large Language Models Serve as Rational Players in Game Theory? A Systematic Analysis
AAAI Conference on Artificial Intelligence (AAAI), 2023
Caoyun Fan
Jindou Chen
Yaohui Jin
Hao He
176
93
0
09 Dec 2023
MAgIC: Investigation of Large Language Model Powered Multi-Agent in Cognition, Adaptability, Rationality and Collaboration
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Lin Xu
Zhiyuan Hu
Daquan Zhou
Hongyu Ren
Zhen Dong
Kurt Keutzer
See Kiong Ng
Jiashi Feng
LRM LLMAG ELM
155
51
0
14 Nov 2023
Suspicion-Agent: Playing Imperfect Information Games with Theory of Mind Aware GPT-4
Jiaxian Guo
Bo Yang
Paul D. Yoo
Bill Yuchen Lin
Yusuke Iwasawa
Yutaka Matsuo
LLMAG
265
59
0
29 Sep 2023
Strategic Behavior of Large Language Models: Game Structure vs. Contextual Framing
Social Science Research Network (SSRN), 2023
Nunzio Lorè
Babak Heydari
126
49
0
12 Sep 2023
Beyond Static Datasets: A Deep Interaction Approach to LLM Evaluation
Jiatong Li
Rui Li
Qi Liu
154
27
0
08 Sep 2023
Boosting Logical Reasoning in Large Language Models through a New Framework: The Graph of Thought
Bin Lei
Pei-Hung Lin
C. Liao
Caiwen Ding
ReLM ELM LRM AI4CE
146
46
0
16 Aug 2023
Exploring the MIT Mathematics and EECS Curriculum Using Large Language Models
Sarah J. Zhang
Samuel H. Florin
Ariel N. Lee
Eamon Niknafs
Andrei Marginean
...
Madeleine Udell
Yoon Kim
Tonio Buonassisi
Armando Solar-Lezama
Iddo Drori
ELM
147
20
0
15 Jun 2023
Strategic Reasoning with Language Models
Kanishk Gandhi
Dorsa Sadigh
Noah D. Goodman
LM&Ro LRM
134
53
0
30 May 2023
Playing repeated games with Large Language Models
Nature Human Behaviour (Nat Hum Behav), 2023
Elif Akata
Lion Schulz
Julian Coda-Forno
Seong Joon Oh
Matthias Bethge
Eric Schulz
1.0K
183
0
26 May 2023
Large Language Models as Commonsense Knowledge for Large-Scale Task Planning
Neural Information Processing Systems (NeurIPS), 2023
Zirui Zhao
W. Lee
David Hsu
LRM LLMAG LM&Ro
330
311
0
23 May 2023
The Machine Psychology of Cooperation: Can GPT models operationalise prompts for altruism, cooperation, competitiveness and selfishness in economic games?
S. Phelps
Y. Russell
236
24
0
13 May 2023
Causal Reasoning and Large Language Models: Opening a New Frontier for Causality
Emre Kıcıman
Robert Osazuwa Ness
Amit Sharma
Chenhao Tan
LRM ELM
428
371
0
28 Apr 2023
ART: Automatic multi-step reasoning and tool-use for large language models
Bhargavi Paranjape
Scott M. Lundberg
Sameer Singh
Hannaneh Hajishirzi
Luke Zettlemoyer
Marco Tulio Ribeiro
KELM ReLM LRM
259
187
0
16 Mar 2023
MathPrompter: Mathematical Reasoning using Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Shima Imani
Liang Du
H. Shrivastava
KELM ReLM LRM
198
259
0
04 Mar 2023
Large Language Models Fail on Trivial Alterations to Theory-of-Mind Tasks
T. Ullman
LRM
281
291
0
16 Feb 2023
Towards Reasoning in Large Language Models: A Survey
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Jie Huang
Kevin Chen-Chuan Chang
LM&MA ELM LRM
756
787
0
20 Dec 2022
Using Large Language Models to Simulate Multiple Humans and Replicate Human Subject Studies
International Conference on Machine Learning (ICML), 2022
Gati Aher
Rosa I. Arriaga
Adam Tauman Kalai
479
530
0
18 Aug 2022
Inner Monologue: Embodied Reasoning through Planning with Language Models
Conference on Robot Learning (CoRL), 2022
Wenlong Huang
F. Xia
Ted Xiao
Harris Chan
Jacky Liang
...
Tomas Jackson
Linda Luu
Sergey Levine
Karol Hausman
Brian Ichter
LLMAG LM&Ro LRM
316
1,132
0
12 Jul 2022
Solving Quantitative Reasoning Problems with Language Models
Neural Information Processing Systems (NeurIPS), 2022
Aitor Lewkowycz
Anders Andreassen
David Dohan
Ethan Dyer
Henryk Michalewski
...
Theo Gutman-Solo
Yuhuai Wu
Behnam Neyshabur
Guy Gur-Ari
Vedant Misra
ReLM ELM LRM
584
1,249
0
29 Jun 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Neural Information Processing Systems (NeurIPS), 2022
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro LRM AI4CE ReLM
2.1K
13,906
0
28 Jan 2022
Recent Advances in Natural Language Processing via Large Pre-Trained Language Models: A Survey
ACM Computing Surveys (CSUR), 2021
Bonan Min
Hayley L Ross
Elior Sulem
Amir Pouran Ben Veyseh
Thien Huu Nguyen
Oscar Sainz
Eneko Agirre
Ilana Heinz
Dan Roth
LM&MA VLM AI4CE
354
1,322
0
01 Nov 2021
Measuring Mathematical Problem Solving With the MATH Dataset
Dan Hendrycks
Collin Burns
Saurav Kadavath
Akul Arora
Steven Basart
Eric Tang
Basel Alomair
Jacob Steinhardt
ReLM FaML
699
3,710
0
05 Mar 2021
PIQA: Reasoning about Physical Commonsense in Natural Language
AAAI Conference on Artificial Intelligence (AAAI), 2019
Yonatan Bisk
Rowan Zellers
Ronan Le Bras
Jianfeng Gao
Yejin Choi
OOD LRM
1.1K
2,415
0
26 Nov 2019