State-free Reinforcement Learning

Neural Information Processing Systems (NeurIPS), 2024
Main:9 Pages
1 Figure
Bibliography:3 Pages
Appendix:11 Pages
Abstract

In this work, we study the \textit{state-free RL} problem, where the algorithm has no information about the states before interacting with the environment. Specifically, denoting the reachable state set by $S^\Pi := \{ s \mid \max_{\pi\in \Pi} q^{P, \pi}(s) > 0 \}$, we design an algorithm that requires no information on the state space $S$ and whose regret is completely independent of $S$, depending only on $S^\Pi$. We view this as a concrete first step towards \textit{parameter-free RL}, with the goal of designing RL algorithms that require no hyper-parameter tuning.
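To make the definition of the reachable state set concrete, the following is a minimal sketch, not the paper's algorithm. It assumes a small tabular episodic MDP whose transition tensor `P[s, a, s']` is known, a fixed initial state, and a policy class $\Pi$ containing all policies, so that a state has positive occupancy under some policy exactly when some action sequence reaches it with positive probability within the horizon. The function name `reachable_states` and the toy MDP are hypothetical and chosen only for illustration; in the state-free setting the learner would not have access to `P` in advance.

```python
import numpy as np


def reachable_states(P, initial_state, horizon):
    """Return the states reachable with positive probability within `horizon` steps.

    P is an array of shape (num_states, num_actions, num_states), where
    P[s, a, s2] is the probability of moving from s to s2 under action a.
    Assumption: the policy class contains every policy, so a state is
    reachable iff some sequence of actions leads to it with positive probability.
    """
    reachable = {initial_state}
    frontier = {initial_state}
    for _ in range(horizon):
        next_states = set()
        for s in frontier:
            # A successor counts if at least one action gives it positive probability.
            positive = np.nonzero(P[s].max(axis=0) > 0)[0]
            next_states.update(int(s2) for s2 in positive)
        # Only newly discovered states need to be expanded in the next step.
        frontier = next_states - reachable
        reachable |= next_states
        if not frontier:
            break
    return reachable


# Toy MDP (hypothetical): 4 states, 2 actions. State 3 exists in the nominal
# state space but can never be reached from state 0.
P = np.zeros((4, 2, 4))
P[0, 0, 1] = 1.0  # action 0 in state 0 leads to state 1
P[0, 1, 2] = 1.0  # action 1 in state 0 leads to state 2
P[1, :, 1] = 1.0  # states 1, 2, 3 are absorbing
P[2, :, 2] = 1.0
P[3, :, 3] = 1.0

S_pi = reachable_states(P, initial_state=0, horizon=3)
print(sorted(S_pi))
```

In this toy example the nominal state space has four states while the reachable set has three, which is the gap between $S$ and $S^\Pi$ that the abstract's regret guarantee is stated in terms of.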