arXiv:1808.05904
Correlated Multi-armed Bandits with a Latent Random Source

17 August 2018
Samarth Gupta
Gauri Joshi
Osman Yağan
Abstract

We consider a novel multi-armed bandit framework where the rewards obtained by pulling the arms are functions of a common latent random variable. The correlation between arms due to the common random source can be used to design a generalized upper-confidence-bound (UCB) algorithm that identifies certain arms as non-competitive and avoids exploring them. As a result, we reduce a K-armed bandit problem to a (C+1)-armed problem, where the C+1 arms comprise the best arm and C competitive arms. Our regret analysis shows that the competitive arms need to be pulled O(log T) times, while the non-competitive arms are pulled only O(1) times. Consequently, there are regimes where our algorithm achieves O(1) regret, as opposed to the typical logarithmic regret scaling of multi-armed bandit algorithms. We also evaluate lower bounds on the expected regret and prove that our correlated-UCB algorithm is order-wise optimal.
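The reduction described in the abstract — running UCB only over arms that remain competitive given the shared latent source — can be sketched as follows. This is an illustrative toy, not the paper's algorithm: it assumes the reward functions g_k are known and that the round's latent draw is observable afterwards, so every pull yields a pseudo-reward for every arm. Arms whose pseudo-reward mean falls below the empirical mean of the current best arm are treated as non-competitive and skipped.

```python
import math
import random

def correlated_ucb(arms, horizon, seed=0):
    """Toy correlated-UCB-style strategy (illustrative sketch only).

    `arms` is a list of known reward functions g_k mapping the latent
    sample x (here uniform on [0, 1]) to a reward. Because the latent
    draw is assumed observable, each round updates a pseudo-reward mean
    for every arm, which is used to prune non-competitive arms.
    """
    rng = random.Random(seed)
    K = len(arms)
    counts = [0] * K          # pulls per arm
    means = [0.0] * K         # empirical mean of observed rewards per arm
    pseudo = [0.0] * K        # running pseudo-reward mean per arm
    total_reward = 0.0
    for t in range(1, horizon + 1):
        x = rng.random()      # common latent sample for this round
        if t <= K:
            k = t - 1         # pull each arm once to initialise
        else:
            best = max(range(K), key=lambda i: means[i])
            # Competitive set: arms whose pseudo-mean is not dominated
            # by the best arm's empirical mean (best arm always kept).
            competitive = [i for i in range(K)
                           if pseudo[i] >= means[best] or i == best]
            # Standard UCB index, restricted to the competitive set.
            k = max(competitive,
                    key=lambda i: means[i]
                    + math.sqrt(2 * math.log(t) / counts[i]))
        r = arms[k](x)
        counts[k] += 1
        means[k] += (r - means[k]) / counts[k]
        # Known reward functions => pseudo-rewards for every arm.
        for i in range(K):
            pseudo[i] += (arms[i](x) - pseudo[i]) / t
        total_reward += r
    return total_reward, counts
```

With, say, three arms g_0(x) = x, g_1(x) = 0.2, and g_2(x) = x², the suboptimal arms are quickly marked non-competitive, so almost all pulls go to arm 0 — matching the abstract's claim that non-competitive arms receive only O(1) pulls.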
