ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2402.12348
  4. Cited By
GTBench: Uncovering the Strategic Reasoning Limitations of LLMs via
  Game-Theoretic Evaluations

GTBench: Uncovering the Strategic Reasoning Limitations of LLMs via Game-Theoretic Evaluations

19 February 2024
Jinhao Duan
Renming Zhang
James Diffenderfer
B. Kailkhura
Lichao Sun
Elias Stengel-Eskin
Mohit Bansal
Tianlong Chen
Kaidi Xu
    ELM
    LRM
ArXivPDFHTML

Papers citing "GTBench: Uncovering the Strategic Reasoning Limitations of LLMs via Game-Theoretic Evaluations"

11 / 11 papers shown
Title
Humanizing LLMs: A Survey of Psychological Measurements with Tools, Datasets, and Human-Agent Applications
Humanizing LLMs: A Survey of Psychological Measurements with Tools, Datasets, and Human-Agent Applications
Wenhan Dong
Yuemeng Zhao
Zhen Sun
Yule Liu
Zifan Peng
...
Jun Wu
Ruiming Wang
Shengmin Xu
Xinyi Huang
Xinlei He
LLMAG
55
0
0
30 Apr 2025
ZeroSumEval: An Extensible Framework For Scaling LLM Evaluation with Inter-Model Competition
ZeroSumEval: An Extensible Framework For Scaling LLM Evaluation with Inter-Model Competition
H. A. Alyahya
Haidar Khan
Yazeed Alnumay
M Saiful Bari
B. Yener
LRM
58
0
0
10 Mar 2025
Reasoning and the Trusting Behavior of DeepSeek and GPT: An Experiment Revealing Hidden Fault Lines in Large Language Models
Reasoning and the Trusting Behavior of DeepSeek and GPT: An Experiment Revealing Hidden Fault Lines in Large Language Models
Rubing Li
João Sedoc
Arun Sundararajan
LRM
53
0
0
20 Feb 2025
Moral Alignment for LLM Agents
Moral Alignment for LLM Agents
Elizaveta Tennant
Stephen Hailes
Mirco Musolesi
35
0
0
02 Oct 2024
Instigating Cooperation among LLM Agents Using Adaptive Information
  Modulation
Instigating Cooperation among LLM Agents Using Adaptive Information Modulation
Qiliang Chen
Sepehr Ilami
Nunzio Lore
Babak Heydari
29
1
0
16 Sep 2024
How Far Are We on the Decision-Making of LLMs? Evaluating LLMs' Gaming Ability in Multi-Agent Environments
How Far Are We on the Decision-Making of LLMs? Evaluating LLMs' Gaming Ability in Multi-Agent Environments
Jen-tse Huang
E. Li
Man Ho Lam
Tian Liang
Wenxuan Wang
Youliang Yuan
Wenxiang Jiao
Xing Wang
Zhaopeng Tu
Michael R. Lyu
ELM
LLMAG
77
32
0
18 Mar 2024
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
DeepSeek-AI Xiao Bi
:
Xiao Bi
Deli Chen
Guanting Chen
...
Yao Zhao
Shangyan Zhou
Shunfeng Zhou
Qihao Zhu
Yuheng Zou
LRM
ALM
139
298
0
05 Jan 2024
MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language
  Feedback
MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback
Xingyao Wang
Zihan Wang
Jiateng Liu
Yangyi Chen
Lifan Yuan
Hao Peng
Heng Ji
LRM
125
137
0
19 Sep 2023
FIREBALL: A Dataset of Dungeons and Dragons Actual-Play with Structured
  Game State Information
FIREBALL: A Dataset of Dungeons and Dragons Actual-Play with Structured Game State Information
Andrew Zhu
Karmanya Aggarwal
Alexander H. Feng
Lara J. Martin
Chris Callison-Burch
24
17
0
02 May 2023
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Xuezhi Wang
Jason W. Wei
Dale Schuurmans
Quoc Le
Ed H. Chi
Sharan Narang
Aakanksha Chowdhery
Denny Zhou
ReLM
BDL
LRM
AI4CE
297
3,163
0
21 Mar 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
315
8,261
0
28 Jan 2022
1