ResearchTrend.AI


arXiv:2106.02900
Differentially Private Multi-Armed Bandits in the Shuffle Model

5 June 2021
J. Tenenbaum
Haim Kaplan
Yishay Mansour
Uri Stemmer
Abstract

We give an $(\varepsilon,\delta)$-differentially private algorithm for the multi-armed bandit (MAB) problem in the shuffle model with a distribution-dependent regret of $O\left(\left(\sum_{a\in[k]:\Delta_a>0}\frac{\log T}{\Delta_a}\right)+\frac{k\sqrt{\log\frac{1}{\delta}}\log T}{\varepsilon}\right)$, and a distribution-independent regret of $O\left(\sqrt{kT\log T}+\frac{k\sqrt{\log\frac{1}{\delta}}\log T}{\varepsilon}\right)$, where $T$ is the number of rounds, $\Delta_a$ is the suboptimality gap of arm $a$, and $k$ is the total number of arms. Our upper bound almost matches the regret of the best known algorithms for the centralized model, and significantly outperforms the best known algorithm in the local model.
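As a rough numerical illustration of how the two bounds behave, the expressions inside the $O(\cdot)$ can be evaluated directly (constants are dropped, and the horizon, gaps, and privacy parameters below are arbitrary example values, not taken from the paper):

```python
import math

def distribution_dependent_bound(gaps, T, eps, delta):
    """Evaluate the distribution-dependent expression (constants dropped):
    sum over suboptimal arms of log T / gap, plus the privacy cost
    k * sqrt(log(1/delta)) * log T / eps."""
    k = len(gaps)
    gap_term = sum(math.log(T) / d for d in gaps if d > 0)
    privacy_term = k * math.sqrt(math.log(1 / delta)) * math.log(T) / eps
    return gap_term + privacy_term

def distribution_independent_bound(k, T, eps, delta):
    """Evaluate the distribution-independent expression (constants dropped):
    sqrt(k * T * log T) plus the same additive privacy cost."""
    privacy_term = k * math.sqrt(math.log(1 / delta)) * math.log(T) / eps
    return math.sqrt(k * T * math.log(T)) + privacy_term

# Example: 3 arms (one optimal, gaps 0.1 and 0.2), T = 1000 rounds,
# (eps, delta) = (1.0, 1e-5); these parameter choices are illustrative only.
dep = distribution_dependent_bound([0.0, 0.1, 0.2], T=1000, eps=1.0, delta=1e-5)
indep = distribution_independent_bound(k=3, T=1000, eps=1.0, delta=1e-5)
```

Note that in both bounds the privacy overhead enters only additively, scaling with $k \log T / \varepsilon$ rather than multiplying the non-private regret, which is why the result nearly matches the centralized model.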
