Sharp analysis of linear ensemble sampling

Arya Akhavan
David Janz
Csaba Szepesvári
Main: 13 pages
1 figure
Bibliography: 3 pages
1 table
Appendix: 17 pages
Abstract

We analyse linear ensemble sampling (ES) with standard Gaussian perturbations in stochastic linear bandits. We show that for ensemble size $m = \Theta(d \log n)$, ES attains $\tilde O(d^{3/2}\sqrt{n})$ high-probability regret, closing the gap to the Thompson sampling benchmark while keeping computation comparable. The proof brings a new perspective on randomized exploration in linear bandits by reducing the analysis to a time-uniform exceedance problem for $m$ independent Brownian motions. Intriguingly, this continuous-time lens is not forced; it appears natural, and perhaps necessary: the discrete-time problem seems to be asking for a continuous-time solution, and we know of no other way to obtain a sharp ES bound.
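For orientation, the following is a minimal sketch of a generic linear ensemble sampling loop with standard Gaussian perturbations. It is not taken from the paper: the function name `linear_ensemble_sampling`, the ridge parameter `lam`, the finite arm set, and the uniform choice of ensemble member each round are all illustrative assumptions, and the exact variant analysed by the authors may differ.

```python
import numpy as np

def linear_ensemble_sampling(arms, theta_star, n, m, lam=1.0, noise_sd=1.0, seed=0):
    """Hypothetical sketch of linear ensemble sampling on a finite arm set.

    arms: (k, d) array of arm feature vectors (assumed finite for simplicity).
    theta_star: (d,) true parameter, used only to simulate rewards and regret.
    n: horizon; m: ensemble size (the abstract's choice is of order d log n).
    Returns the cumulative pseudo-regret over n rounds.
    """
    rng = np.random.default_rng(seed)
    k, d = arms.shape
    V = lam * np.eye(d)  # regularised design matrix, shared by all members
    # Column j holds member j's perturbed response vector; the initial
    # Gaussian term perturbs the regulariser (an assumed initialisation).
    B = np.sqrt(lam) * rng.standard_normal((d, m))
    best = np.max(arms @ theta_star)
    regret = 0.0
    for _ in range(n):
        j = rng.integers(m)                    # pick one member uniformly at random
        theta_j = np.linalg.solve(V, B[:, j])  # that member's perturbed ridge estimate
        x = arms[int(np.argmax(arms @ theta_j))]  # act greedily with respect to it
        y = x @ theta_star + noise_sd * rng.standard_normal()  # simulated reward
        V += np.outer(x, x)
        # Every member sees the same reward plus its own N(0, 1) perturbation.
        B += np.outer(x, y + rng.standard_normal(m))
        regret += best - x @ theta_star
    return regret
```

Each round costs one linear solve plus a rank-one update of `V` and of the `d`-by-`m` matrix `B`, which is why the ensemble size `m` governs the computational overhead relative to a single ridge estimate.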
