arXiv:2407.13943
Werewolf Arena: A Case Study in LLM Evaluation via Social Deduction
18 July 2024
Suma Bailis, Jane Friedhoff, Feiyang Chen
Papers citing "Werewolf Arena: A Case Study in LLM Evaluation via Social Deduction"
DSGBench: A Diverse Strategic Game Benchmark for Evaluating LLM-based Agents in Complex Decision-Making Environments
Wenjie Tang, Yuan Zhou, Erqiang Xu, Keyan Cheng, Minne Li, Liquan Xiao
ELM
47
1
0
08 Mar 2025
Simulating Human Strategic Behavior: Comparing Single and Multi-agent LLMs
Karthik Sreedhar, Lydia B. Chilton
LLMAG
46
12
0
13 Feb 2024