MDP Geometry, Normalization and Value Free Solvers

Main: 7 pages
13 figures
Bibliography: 2 pages
Appendix: 22 pages
Abstract

The Markov Decision Process (MDP) is a widely used mathematical model for sequential decision-making problems. In this paper, we present a new geometric interpretation of MDPs. Based on this interpretation, we show that MDPs can be divided into equivalence classes within which the dynamics of key solving algorithms are indistinguishable. The associated normalization procedure enables the development of a novel class of MDP-solving algorithms that find optimal policies without computing policy values. The new algorithms we propose for different settings achieve and, in some cases, improve upon state-of-the-art results.
