ResearchTrend.AI
Tight Bounds for Bandit Combinatorial Optimization

24 February 2017
Alon Cohen
Tamir Hazan
Tomer Koren
arXiv:1702.07539 (abs · PDF · HTML)
Abstract

We revisit the study of optimal regret rates in bandit combinatorial optimization, a fundamental framework for sequential decision making under uncertainty that abstracts numerous combinatorial prediction problems. We prove that the attainable regret in this setting grows as $\widetilde{\Theta}(k^{3/2}\sqrt{dT})$, where $d$ is the dimension of the problem and $k$ is a bound on the maximal instantaneous loss, disproving a conjecture of Audibert, Bubeck, and Lugosi (2013), who argued that the optimal rate should be of the form $\widetilde{\Theta}(k\sqrt{dT})$. Our bounds apply to several important instances of the framework and, in particular, imply a tight bound for the well-studied bandit shortest path problem. In doing so, we also resolve an open problem posed by Cesa-Bianchi and Lugosi (2012).
