
Robust Q-Learning for finite ambiguity sets

Abstract

In this paper we propose a novel Q-learning algorithm that solves distributionally robust Markov decision problems in which the ambiguity set of probability measures can be chosen arbitrarily, as long as it comprises only finitely many measures. Our approach thus goes beyond the well-studied cases in which the ambiguity set is a ball around some reference measure, with the distance to the reference measure taken with respect to the Wasserstein distance or the Kullback-Leibler divergence. It therefore allows the practitioner to create ambiguity sets better tailored to her needs and to solve the associated robust Markov decision problem via a Q-learning algorithm whose convergence is guaranteed by our main result. Moreover, we showcase the tractability of our approach in several numerical experiments.
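To make the setting concrete, the following is a minimal sketch of the worst-case Bellman recursion over a finite ambiguity set. It is not the Q-learning algorithm of the paper, which the abstract does not spell out: it is plain robust value iteration, assuming a finite state and action space, fully known transition kernels, and a reward that depends only on the current state and action. The function name `robust_q_iteration` and all parameters are hypothetical and chosen for illustration only.

```python
import numpy as np


def robust_q_iteration(kernels, rewards, discount=0.9, tol=1e-10, max_iter=10_000):
    """Worst-case Q-value iteration over a finite set of transition kernels.

    Assumptions (not taken from the paper):
      kernels : array of shape (K, S, A, S), kernels[k, s, a] is the
                next-state distribution under the k-th candidate measure
      rewards : array of shape (S, A), reward for taking action a in state s
    """
    kernels = np.asarray(kernels, dtype=float)
    rewards = np.asarray(rewards, dtype=float)
    q = np.zeros_like(rewards)
    for _ in range(max_iter):
        v = q.max(axis=1)                      # greedy value per state, shape (S,)
        expected = kernels @ v                 # expected next value, shape (K, S, A)
        worst_case = expected.min(axis=0)      # minimum over the K candidate measures
        q_new = rewards + discount * worst_case
        if np.max(np.abs(q_new - q)) < tol:
            return q_new
        q = q_new
    return q


# Toy example: two states, two actions, two candidate measures.
# Measure 0 always moves to state 0, measure 1 always moves to state 1.
example_kernels = np.zeros((2, 2, 2, 2))
example_kernels[0, :, :, 0] = 1.0
example_kernels[1, :, :, 1] = 1.0
example_rewards = np.array([[1.0, 0.0], [0.0, 2.0]])

q_example = robust_q_iteration(example_kernels, example_rewards, discount=0.5)
print(q_example)
```

Because the minimum is taken over an explicit list of kernels, any finite collection of measures can be plugged in, which is the flexibility the abstract emphasizes. A sample-based Q-learning scheme would replace the exact expectations with stochastic updates.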

@article{decker2025_2407.04259,
  title={ Robust Q-Learning for finite ambiguity sets },
  author={ Cécile Decker and Julian Sester },
  journal={arXiv preprint arXiv:2407.04259},
  year={ 2025 }
}