Multimodal Bandits: Regret Lower Bounds and Optimal Algorithms

29 October 2025

William Réveillard

Richard Combes


Main: 9 Pages

6 Figures

Bibliography: 2 Pages

Appendix: 20 Pages

Abstract

We consider a stochastic multi-armed bandit problem with i.i.d. rewards where the expected reward function is multimodal with at most m modes. We propose the first known computationally tractable algorithm for computing the solution to the Graves-Lai optimization problem, which in turn enables the implementation of asymptotically optimal algorithms for this bandit problem. The code for the proposed algorithms is publicly available at this https URL
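
To make the structural assumption concrete, the following is a minimal sketch of what "at most m modes" can mean for a vector of expected rewards. It is not taken from the paper or its code: it assumes the arms are arranged on a line and that a mode is a strict local maximum with respect to immediate neighbours, and the function names `count_modes` and `has_at_most_m_modes` are hypothetical.

```python
from typing import Sequence


def count_modes(mu: Sequence[float]) -> int:
    """Count strict local maxima of mu.

    Assumption (illustrative only): arms are ordered on a line, so arm k
    has neighbours k-1 and k+1, and boundary arms have a single neighbour.
    """
    n = len(mu)
    modes = 0
    for k in range(n):
        # A missing neighbour (at the boundary) imposes no constraint.
        left_ok = k == 0 or mu[k] > mu[k - 1]
        right_ok = k == n - 1 or mu[k] > mu[k + 1]
        if left_ok and right_ok:
            modes += 1
    return modes


def has_at_most_m_modes(mu: Sequence[float], m: int) -> bool:
    """Check whether the reward vector mu satisfies the m-mode constraint."""
    return count_modes(mu) <= m


if __name__ == "__main__":
    rewards = [0.1, 0.5, 0.2, 0.7, 0.3]  # hypothetical expected rewards
    print(count_modes(rewards))
    print(has_at_most_m_modes(rewards, 2))
```

Under these assumptions, a unimodal bandit corresponds to m = 1, and larger m admits reward vectors with several separated peaks.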

