PAC-Bayes-Bernstein Inequality for Martingales and its Application to Multiarmed Bandits

31 October 2011

Yevgeny Seldin

Abstract

We combine PAC-Bayesian analysis with a Bernstein-type inequality for martingales to obtain a result that makes it possible to control the concentration of multiple (possibly uncountably many) simultaneously evolving and interdependent martingales. We apply this result to derive a regret bound for the multiarmed bandit problem. Our result forms a basis for an integrated, simultaneous analysis of the exploration-exploitation and model order selection trade-offs. It also opens the way to applying PAC-Bayesian analysis in other fields where sequentially dependent samples and limited feedback are encountered.
