
Faster Rates for Private Adversarial Bandits

Main: 12 pages
Bibliography: 3 pages
Appendix: 27 pages
1 figure
2 tables
Abstract

We design new differentially private algorithms for the problems of adversarial bandits and bandits with expert advice. For adversarial bandits, we give a simple and efficient conversion of any non-private bandit algorithm to a private bandit algorithm. Instantiating our conversion with existing non-private bandit algorithms gives a regret upper bound of $O\left(\frac{\sqrt{KT}}{\sqrt{\epsilon}}\right)$, improving upon the existing upper bound $O\left(\frac{\sqrt{KT \log(KT)}}{\epsilon}\right)$ for all $\epsilon \leq 1$. In particular, our algorithms allow for sublinear expected regret even when $\epsilon \leq \frac{1}{\sqrt{T}}$, establishing the first known separation between central and local differential privacy for this problem. For bandits with expert advice, we give the first differentially private algorithms, with expected regret $O\left(\frac{\sqrt{NT}}{\sqrt{\epsilon}}\right)$, $O\left(\frac{\sqrt{KT\log(N)}\log(KT)}{\epsilon}\right)$, and $\tilde{O}\left(\frac{N^{1/6}K^{1/2}T^{2/3}\log(NT)}{\epsilon^{1/3}} + \frac{N^{1/2}\log(NT)}{\epsilon}\right)$, where $K$ and $N$ are the number of actions and experts respectively. These rates allow us to get sublinear regret for different combinations of small and large $K$, $N$, and $\epsilon$.
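The comparison between the two adversarial-bandit bounds can be checked with a short calculation. The sketch below is not taken from the paper: it ignores the constants hidden by $O(\cdot)$, treats $K$ as fixed, and assumes $KT \geq e$ so that $\log(KT) \geq 1$. It shows the ratio of the new bound to the existing one, and then evaluates both at the illustrative choice $\epsilon = 1/\sqrt{T}$.

```latex
% Ratio of the new bound to the existing bound (constants ignored,
% assuming KT >= e so that log(KT) >= 1).
\[
\frac{\sqrt{KT}/\sqrt{\epsilon}}{\sqrt{KT\log(KT)}/\epsilon}
  = \sqrt{\frac{\epsilon}{\log(KT)}} \;\le\; 1
  \qquad \text{for all } \epsilon \le 1 .
\]

% Both bounds evaluated at the illustrative choice epsilon = 1/sqrt(T).
\[
\frac{\sqrt{KT}}{\sqrt{\epsilon}} = \sqrt{K}\,T^{3/4} = o(T),
\qquad
\frac{\sqrt{KT\log(KT)}}{\epsilon} = T\sqrt{K\log(KT)} \;\ge\; T .
\]
```

Under these assumptions the new bound stays sublinear in $T$ at $\epsilon = 1/\sqrt{T}$, while the existing bound is at least linear there and so gives no useful guarantee. More generally, $\sqrt{KT/\epsilon}$ is $o(T)$ whenever $\epsilon T / K \to \infty$.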

@article{asi2025_2505.21790,
  title={Faster Rates for Private Adversarial Bandits},
  author={Hilal Asi and Vinod Raman and Kunal Talwar},
  journal={arXiv preprint arXiv:2505.21790},
  year={2025}
}