Improved Confidence Bounds for the Linear Logistic Model and Applications to Linear Bandits

International Conference on Machine Learning (ICML), 2020
Abstract

We propose improved fixed-design confidence bounds for the linear logistic model. Our bounds significantly improve upon the state-of-the-art bounds of Li et al. (2017) by leveraging the self-concordance of the logistic loss, inspired by Faury et al. (2020). Specifically, our confidence width does not scale with the problem-dependent parameter 1/κ, where κ is the worst-case variance of an arm's reward; at worst, 1/κ scales exponentially with the norm of the unknown linear parameter θ*. Instead, our bound scales directly with the local variance induced by θ*. We present two applications of our new bounds to logistic bandit problems: regret minimization and pure exploration. Our analysis shows that the new confidence bounds improve upon previous state-of-the-art performance guarantees.
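To make the κ dependence concrete, the following is a sketch under the standard logistic-bandit convention (assumed here; the paper's exact definition may differ), where μ(z) = 1/(1 + e^{-z}) is the logistic link and arms lie in the unit ball:

```latex
% Hedged sketch: standard logistic-bandit definition of \kappa (assumed,
% not taken verbatim from the paper). With link \mu(z) = 1/(1+e^{-z})
% and arm set \mathcal{X}, the worst-case reward variance is
\[
  \kappa \;:=\; \min_{x \in \mathcal{X}} \dot{\mu}(x^\top \theta^*),
  \qquad
  \dot{\mu}(z) = \mu(z)\bigl(1 - \mu(z)\bigr).
\]
% For \|x\|_2 \le 1 we have |x^\top \theta^*| \le \|\theta^*\|_2, and since
% \dot{\mu}(z) = e^{-|z|}/(1+e^{-|z|})^2 \ge e^{-|z|}/4, it follows that
\[
  \frac{1}{\kappa}
  \;\le\; \frac{1}{\dot{\mu}\bigl(\|\theta^*\|_2\bigr)}
  \;\le\; 4\, e^{\|\theta^*\|_2},
\]
% i.e., 1/\kappa can grow exponentially with the norm of \theta^*,
% which is the scaling the proposed bounds avoid.
```

This illustrates why bounds whose width multiplies by 1/κ degrade rapidly as ‖θ*‖ grows, and why replacing it with the local variance at θ* is an improvement.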
