ResearchTrend.AI
BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games

20 November 2024
Davide Paglieri
Bartłomiej Cupiał
Samuel Coward
Ulyana Piterbarg
Maciej Wołczyk
Akbir Khan
Eduardo Pignatelli
Łukasz Kuciński
Lerrel Pinto
Rob Fergus
Jakob Foerster
Jack Parker-Holder
Tim Rocktäschel
    LLMAG
    LRM

Papers citing "BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games"

6 / 6 papers shown
Benchmarking LLMs' Swarm intelligence
Kai Ruan
Mowen Huang
Ji-Rong Wen
Hao Sun
33
0
0
07 May 2025
Playpen: An Environment for Exploring Learning Through Conversational Interaction
Nicola Horst
Davide Mazzaccara
Antonia Schmidt
Michael Sullivan
Filippo Momentè
...
Alexander Koller
Oliver Lemon
David Schlangen
Mario Giulianelli
Alessandro Suglia
OffRL
27
0
0
11 Apr 2025
Fine-Tuning Diffusion Generative Models via Rich Preference Optimization
Hanyang Zhao
Haoxian Chen
Yucheng Guo
Genta Indra Winata
Tingting Ou
Ziyu Huang
D. Yao
Wenpin Tang
50
0
0
13 Mar 2025
Factorio Learning Environment
Jack Hopkins
Mart Bakler
Akbir Khan
LRM
AI4CE
LLMAG
47
0
0
06 Mar 2025
MLGym: A New Framework and Benchmark for Advancing AI Research Agents
Deepak Nathani
Lovish Madaan
Nicholas Roberts
Nikolay Bashlykov
Ajay Menon
...
Tatiana Shavrina
Jakob Foerster
Yoram Bachrach
William Yang Wang
Roberta Raileanu
LLMAG
72
7
0
21 Feb 2025
Harnessing Language for Coordination: A Framework and Benchmark for LLM-Driven Multi-Agent Control
Timothée Anne
Noah Syrkis
Meriem Elhosni
Florian Turati
Franck Legendre
Alain Jaquier
Sebastian Risi
LLMAG
85
1
0
16 Dec 2024