
arXiv:2209.02570 (v2, latest)

When Privacy Meets Partial Information: A Refined Analysis of Differentially Private Bandits

Neural Information Processing Systems (NeurIPS), 2022
6 September 2022
Achraf Azize
D. Basu
Abstract

We study the problem of multi-armed bandits with ε-global Differential Privacy (DP). First, we prove minimax and problem-dependent regret lower bounds for stochastic and linear bandits that quantify the hardness of bandits with ε-global DP. These bounds suggest the existence of two hardness regimes depending on the privacy budget ε. In the high-privacy regime (small ε), the hardness depends on a coupled effect of privacy and partial information about the reward distributions. In the low-privacy regime (large ε), bandits with ε-global DP are not harder than bandits without privacy. For stochastic bandits, we further propose a generic framework to design a near-optimal ε-global DP extension of an index-based optimistic bandit algorithm. The framework consists of three ingredients: the Laplace mechanism, arm-dependent adaptive episodes, and the use of only the rewards collected in the last episode to compute private statistics. Specifically, we instantiate ε-global DP extensions of the UCB and KL-UCB algorithms, namely AdaP-UCB and AdaP-KLUCB. AdaP-KLUCB is the first algorithm that both satisfies ε-global DP and yields a regret upper bound that matches the problem-dependent lower bound up to multiplicative constants.
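The three ingredients of the framework can be illustrated with a minimal sketch. This is not the authors' AdaP-UCB: the exploration bonus and index constants below are placeholders, and the Bernoulli reward model and helper names are assumptions made for the example. It only shows the structure the abstract describes: an optimistic index computed from a Laplace-noised mean, arm-dependent doubling episodes, and private statistics built solely from the last episode's rewards.

```python
import numpy as np

def private_episodic_ucb(means, epsilon, horizon, seed=0):
    """Sketch of an eps-global DP optimistic bandit in the spirit of the
    abstract's framework (NOT the paper's exact AdaP-UCB algorithm):
      (1) Laplace mechanism for the released statistic,
      (2) arm-dependent adaptive (doubling) episodes,
      (3) private mean computed only from the last episode's rewards.
    Rewards are Bernoulli(means[arm]); returns the cumulative regret."""
    rng = np.random.default_rng(seed)
    K = len(means)
    episode_len = np.ones(K, dtype=int)   # next episode length per arm (doubles)
    private_mean = np.zeros(K)            # Laplace-noised mean of last episode
    pulls = np.zeros(K, dtype=int)
    best, regret, t = max(means), 0.0, 0

    while t < horizon:
        if np.min(pulls) == 0:
            arm = int(np.argmin(pulls))   # initialise each arm once
        else:
            # Optimistic index on the *private* mean; the extra 1/eps term is a
            # placeholder privacy bonus, not the paper's exact confidence width.
            bonus = np.sqrt(np.log(t) / pulls) + np.log(t) / (epsilon * pulls)
            arm = int(np.argmax(private_mean + bonus))

        # Play the chosen arm for one whole episode.
        L = int(episode_len[arm])
        rewards = rng.binomial(1, means[arm], size=L)

        # Ingredient (1)+(3): the mean of L rewards in [0,1] has sensitivity
        # 1/L, so Laplace noise of scale 1/(eps*L) gives eps-DP per release;
        # forgetting earlier episodes avoids repeated spending on old data.
        private_mean[arm] = rewards.mean() + rng.laplace(scale=1.0 / (epsilon * L))

        pulls[arm] += L
        episode_len[arm] *= 2             # ingredient (2): doubling episodes
        regret += (best - means[arm]) * L
        t += L
    return regret
```

With doubling episodes an arm is re-indexed only O(log T) times, which is what keeps the total privacy cost of the released statistics bounded; the paper's actual episode schedule and confidence bounds differ from this placeholder.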
