Utilizing Novelty-based Evolution Strategies to Train Transformers in Reinforcement Learning
In this paper, we experiment with novelty-based variants of OpenAI-ES, namely the NS-ES and NSR-ES algorithms, and evaluate their effectiveness in training complex, transformer-based architectures designed for reinforcement learning, such as the Decision Transformer. We also test whether we can accelerate the novelty-based training of these larger models by seeding the training with a pretrained model. In doing so, we build on our previous work, in which we tested the ability of evolution strategies, specifically the aforementioned OpenAI-ES, to train the Decision Transformer architecture. The results were mixed. NS-ES made progress, but it would clearly need many more iterations to yield interesting results. NSR-ES, on the other hand, proved quite capable of being applied straightforwardly to larger models, since its performance on the feed-forward model and on the Decision Transformer was as similar as it was for OpenAI-ES in our previous work.
@article{lorenc2025_2502.06301,
  title   = {Utilizing Novelty-based Evolution Strategies to Train Transformers in Reinforcement Learning},
  author  = {Matyáš Lorenc},
  journal = {arXiv preprint arXiv:2502.06301},
  year    = {2025}
}