arXiv:1810.08164

Exploiting Correlation in Finite-Armed Structured Bandits

18 October 2018
Samarth Gupta
Shreyas Chaudhari
Subhojyoti Mukherjee
Gauri Joshi
Abstract

We consider a structured multi-armed bandit problem in which the mean rewards of different arms are related through a hidden parameter. We propose an approach that generalizes classical bandit algorithms such as UCB and Thompson sampling to the structured bandit setting. The approach exploits the structure of the problem to identify some arms as sub-optimal and pulls them only O(1) times. This significantly reduces cumulative regret; in fact, our algorithm achieves bounded (i.e., O(1)) regret whenever possible. We empirically demonstrate the superiority of our algorithms via simulations and experiments on the MovieLens dataset. Moreover, the problem setting studied in this paper subsumes several previously studied frameworks, such as Global, Regional, and Structured bandits with linear rewards.
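
The idea of ruling out arms through the hidden parameter can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the authors' algorithm: the function name `structured_ucb`, the finite grid of candidate parameters `thetas`, the Gaussian noise model, and the confidence radius with constant `alpha` are all hypothetical choices made for illustration. At each round it keeps only the parameter values consistent with the empirical means, treats an arm as a candidate only if it is optimal for at least one such value, and applies a standard UCB index to the candidates.

```python
import math
import random


def structured_ucb(mean_fns, thetas, true_theta, horizon,
                   sigma=1.0, alpha=2.0, seed=0):
    """Illustrative UCB variant that skips arms ruled out by the structure.

    mean_fns[a](theta) is the (known) mean reward of arm a under theta.
    thetas is an assumed finite grid of candidate hidden parameters.
    Returns the number of pulls of each arm after `horizon` rounds.
    """
    rng = random.Random(seed)
    k = len(mean_fns)
    counts = [0] * k
    sums = [0.0] * k

    def pull(arm):
        # Gaussian reward noise is an assumption of this sketch.
        reward = rng.gauss(mean_fns[arm](true_theta), sigma)
        counts[arm] += 1
        sums[arm] += reward

    # Pull every arm once so that all empirical means are defined.
    for arm in range(k):
        pull(arm)

    for t in range(k + 1, horizon + 1):
        means = [sums[a] / counts[a] for a in range(k)]
        radii = [math.sqrt(2 * alpha * sigma ** 2 * math.log(t) / counts[a])
                 for a in range(k)]

        # Parameters whose predicted means agree with every empirical mean.
        plausible = [th for th in thetas
                     if all(abs(mean_fns[a](th) - means[a]) <= radii[a]
                            for a in range(k))]

        # Arms that are optimal for at least one plausible parameter.
        candidates = {max(range(k), key=lambda a: mean_fns[a](th))
                      for th in plausible}
        if not candidates:
            # Fall back to plain UCB if the confidence set is empty.
            candidates = set(range(k))

        # Standard UCB index, restricted to the candidate arms.
        pull(max(candidates, key=lambda a: means[a] + radii[a]))

    return counts


# Hypothetical three-arm instance: arm 0 is optimal at theta = 3.5.
thetas = [i / 10 for i in range(41)]
mean_fns = [lambda th: th, lambda th: 2.5, lambda th: 4.0 - th]
counts = structured_ucb(mean_fns, thetas, true_theta=3.5, horizon=2000)
print(counts)
```

In this instance, arms 1 and 2 are optimal only for parameter values of at most 2.5, so once the estimate of arm 0's mean is tight enough to exclude those values, neither arm remains a candidate and the sketch stops pulling them.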
