This paper studies \emph{differential privacy (DP)} and \emph{local differential privacy (LDP)} in cascading bandits. Under DP, we propose an algorithm that guarantees $\epsilon$-indistinguishability and a regret bound stated in terms of an arbitrarily small constant. This is a significant improvement over the regret of previous work. Under $(\epsilon,\delta)$-LDP, we relax the dependence of the regret through the tradeoff between the privacy budget $\epsilon$ and the error probability $\delta$, and obtain a regret bound in terms of the size of the arm subset. This result holds for both the Gaussian mechanism and the Laplace mechanism, through an analysis of composition. Our results extend to combinatorial semi-bandits. We show respective lower bounds for DP and LDP cascading bandits. Extensive experiments corroborate our theoretical findings.