ResearchTrend.AI

arXiv:1907.05632

Laplacian-regularized graph bandits: Algorithms and theoretical analysis

12 July 2019
Kaige Yang
Xiaowen Dong
Laura Toni
Abstract

We consider a stochastic linear bandit problem with multiple users, where the relationship between users is captured by an underlying graph and user preferences are represented as smooth signals on the graph. We introduce a novel bandit algorithm in which the smoothness prior is imposed via the random-walk graph Laplacian, which leads to a single-user cumulative regret scaling as $\tilde{\mathcal{O}}(\Psi d \sqrt{T})$ with time horizon $T$, feature dimensionality $d$, and a scalar parameter $\Psi \in (0,1)$ that depends on the graph connectivity. This improves on the $\tilde{\mathcal{O}}(d \sqrt{T})$ regret of LinUCB (Li et al., 2010), which does not take user relationships into account. In terms of network regret (the sum of cumulative regret over $n$ users), the proposed algorithm scales as $\tilde{\mathcal{O}}(\Psi d\sqrt{nT})$, a significant improvement over the $\tilde{\mathcal{O}}(nd\sqrt{T})$ of the state-of-the-art algorithm Gob.Lin (Cesa-Bianchi et al., 2013). To improve scalability, we further propose a simplified algorithm whose computational complexity is linear in the number of users, while maintaining the same regret. Finally, we present a finite-time analysis of the proposed algorithms and demonstrate their advantage over state-of-the-art graph-based bandit algorithms on both synthetic and real-world data.
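To make two ingredients of the abstract concrete, here is a minimal Python sketch. It builds the random-walk graph Laplacian, which we assume takes the standard form $L = I - D^{-1}W$, evaluates the quadratic form $\mathrm{tr}(\Theta^\top L \Theta)$ as one common smoothness measure for user parameters stacked row-wise in $\Theta$, and computes the ratio of the two network-regret rates quoted above with constants and logarithmic factors ignored. The function names, the choice of smoothness measure, and the toy ring graph are illustrative assumptions; the abstract does not specify the algorithm's exact objective, and this is not the authors' implementation.

```python
import numpy as np


def random_walk_laplacian(W):
    """Return L = I - D^{-1} W for a weighted adjacency matrix W.

    Assumes the standard definition of the random-walk Laplacian and
    that every node has strictly positive degree.
    """
    W = np.asarray(W, dtype=float)
    deg = W.sum(axis=1)
    if np.any(deg <= 0):
        raise ValueError("every node needs a strictly positive degree")
    return np.eye(W.shape[0]) - W / deg[:, None]


def smoothness(Theta, L):
    """Quadratic form tr(Theta^T L Theta).

    Theta has one row per user. This is an illustrative smoothness
    measure (an assumption, not taken from the abstract); it is zero
    when all users share the same parameter vector.
    """
    Theta = np.asarray(Theta, dtype=float)
    return float(np.trace(Theta.T @ L @ Theta))


def network_bound_ratio(psi, n):
    """Ratio (psi * d * sqrt(n*T)) / (n * d * sqrt(T)) = psi / sqrt(n).

    Compares only the leading rates quoted in the abstract; constants
    and logarithmic factors are ignored.
    """
    if not 0.0 < psi < 1.0:
        raise ValueError("psi must lie in (0, 1)")
    if n < 1:
        raise ValueError("n must be at least 1")
    return psi / np.sqrt(n)


if __name__ == "__main__":
    # Toy graph (hypothetical): an unweighted ring over four users.
    W = np.array([
        [0, 1, 0, 1],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [1, 0, 1, 0],
    ])
    L = random_walk_laplacian(W)

    identical = np.ones((4, 3))                          # all users agree
    alternating = np.array([[1.0], [-1.0], [1.0], [-1.0]])  # neighbours disagree

    print("smoothness, identical users:  ", smoothness(identical, L))
    print("smoothness, alternating users:", smoothness(alternating, L))
    print("rate ratio for psi=0.5, n=100:", network_bound_ratio(0.5, 100))
```

Identical user vectors give a smoothness value of zero because each row of $L$ sums to zero, while neighbours that disagree give a strictly positive value on this graph. The rate ratio $\Psi/\sqrt{n}$ shows why the network-regret improvement grows with the number of users.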
