PCGRLLM: Large Language Model-Driven Reward Design for Procedural Content Generation Reinforcement Learning

15 February 2025
In-Chang Baek, Sung-Hyun Kim, Sam Earle, Zehua Jiang, Noh Jin-Ha, Julian Togelius, Kyung-Joong Kim
Abstract

Reward design plays a pivotal role in training game AIs, requiring substantial domain-specific knowledge and human effort. In recent years, several studies have explored using large language models (LLMs) to generate rewards for training game agents and controlling robots. In the content generation literature, there has been early work on generating reward functions for generators implemented as reinforcement learning agents. This work introduces PCGRLLM, an extended architecture based on earlier work, which employs a feedback mechanism and several reasoning-based prompt engineering techniques. We evaluate the proposed method on a story-to-reward generation task in a two-dimensional environment using two state-of-the-art LLMs, demonstrating the generalizability of our approach. Our experiments evaluate the LLM capabilities essential for content generation tasks. The results show performance improvements of 415% and 40% for the two models respectively, depending on each model's zero-shot capabilities. Our work demonstrates the potential to reduce human dependency in game AI development while supporting and enhancing creative processes.
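
The abstract describes an iterative pipeline in which an LLM writes a reward function from a story description, a PCGRL generator is trained with it, and evaluation feedback is returned to the LLM for revision. Below is a minimal sketch of such a loop, assuming a generic llm_query callable and a placeholder train-and-evaluate step; names such as generate_reward_code, Feedback, and reward_design_loop are illustrative assumptions and not the paper's actual API.

from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Feedback:
    """Summary of a training run, returned to the LLM as textual feedback."""
    score: float
    notes: str

def generate_reward_code(llm_query: Callable[[str], str],
                         story: str,
                         feedback: Optional[Feedback]) -> str:
    """Ask the LLM to (re)write a reward function for the target story.

    Prior feedback is included in the prompt so the model can revise its
    design, mirroring the feedback mechanism described in the abstract.
    """
    prompt = ("Write a Python reward function for a level generator so that "
              "the generated level reflects this story:\n" + story + "\n")
    if feedback is not None:
        prompt += ("\nPrevious attempt scored {:.2f}. Notes: {}\n"
                   "Revise the reward function accordingly."
                   .format(feedback.score, feedback.notes))
    return llm_query(prompt)

def train_and_evaluate(reward_source: str) -> Feedback:
    """Placeholder: train a PCGRL generator with the candidate reward and
    score the generated content against the target story."""
    # A real pipeline would compile reward_source, train an RL level
    # generator with it, and measure how well the output matches the story.
    return Feedback(score=0.0, notes="stub evaluation")

def reward_design_loop(llm_query: Callable[[str], str],
                       story: str,
                       iterations: int = 3) -> str:
    """Iterate LLM reward generation -> RL training -> feedback, keeping the
    best-scoring reward function."""
    feedback: Optional[Feedback] = None
    best_code, best_score = "", float("-inf")
    for _ in range(iterations):
        code = generate_reward_code(llm_query, story, feedback)
        feedback = train_and_evaluate(code)
        if feedback.score > best_score:
            best_code, best_score = code, feedback.score
    return best_code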

@article{baek2025_2502.10906,
  title={PCGRLLM: Large Language Model-Driven Reward Design for Procedural Content Generation Reinforcement Learning},
  author={In-Chang Baek and Sung-Hyun Kim and Sam Earle and Zehua Jiang and Noh Jin-Ha and Julian Togelius and Kyung-Joong Kim},
  journal={arXiv preprint arXiv:2502.10906},
  year={2025}
}