Drama: Mamba-Enabled Model-Based Reinforcement Learning Is Sample and Parameter Efficient
International Conference on Learning Representations (ICLR), 2024
Main: 10 pages · Appendix: 14 pages · Bibliography: 4 pages · 12 figures · 8 tables
Abstract
Model-based reinforcement learning (RL) offers a solution to the data inefficiency that plagues most model-free RL algorithms. However, learning a robust world model often requires complex and deep architectures, which are computationally expensive and challenging to train. Within the world model, sequence models play a critical role in accurate predictions, and various architectures have been explored, each with its own challenges. Currently, recurrent neural network (RNN)-based world models struggle with vanishing gradients and capturing long-term dependencies. Transformers, on the other hand, suffer from the quadratic memory and computational complexity of self-attention mechanisms, scaling as O(L²), where L is the sequence length.
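To make the quadratic cost concrete, the sketch below (a minimal NumPy illustration, not code from the paper) implements single-head self-attention without learned projections: the score matrix has shape (L, L), so its memory grows quadratically with the sequence length L.

```python
import numpy as np

def naive_self_attention(x):
    """Single-head self-attention without learned projections.

    The score matrix has shape (L, L), so memory and compute
    scale as O(L^2) in the sequence length L.
    """
    L, d = x.shape
    scores = x @ x.T / np.sqrt(d)  # (L, L) -- the quadratic term
    # Numerically stable softmax over each row of the score matrix.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ x, scores.shape

rng = np.random.default_rng(0)
out, score_shape = naive_self_attention(rng.standard_normal((512, 64)))
# score_shape is (512, 512); doubling L to 1024 quadruples the matrix.
```

Recurrent and state-space models (such as Mamba) avoid materializing this L×L matrix, which is the motivation for the architecture studied in the paper.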
