A Refined Analysis of UCBVI

Abstract

In this work, we provide a refined analysis of the UCBVI algorithm (Azar et al., 2017), improving both the bonus terms and the regret analysis. Additionally, we compare our version of UCBVI with both its original version and the state-of-the-art MVP algorithm. Our empirical validation demonstrates that improving the multiplicative constants in the bounds has significant positive effects on the empirical performance of the algorithms.

@article{drago2025_2502.17370,
  title={A Refined Analysis of UCBVI},
  author={Simone Drago and Marco Mussi and Alberto Maria Metelli},
  journal={arXiv preprint arXiv:2502.17370},
  year={2025}
}