On rate-optimal classification from non-private and from private data

Balázs Csanád Csáji
Abstract

In this paper we revisit the classical problem of classification, but impose privacy constraints. Under such constraints, the raw data $(X_1,Y_1),\ldots,(X_n,Y_n)$ cannot be directly observed, and all classifiers are functions of the randomised outcome of a suitable local differential privacy mechanism. The statistician is free to choose the form of this privacy mechanism, and here we add Laplace distributed noise to a discretisation of the location of each feature vector $X_i$ and to its label $Y_i$. The classification rule is the privatized version of the well-studied partitioning classification rule. In addition to the standard Lipschitz and margin conditions, a novel characteristic is introduced, by which the exact rate of convergence of the classification error probability is calculated, both for non-private and private data.
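To make the mechanism concrete, the following is a minimal sketch of one plausible reading of the abstract, not the paper's exact construction. It assumes one-dimensional features in [0, 1], labels in {-1, +1}, a partition into equal-width cells, and a Laplace scale of 2/epsilon (chosen here because changing one individual's data alters the released vector by at most 2 in L1 norm). All function names and parameters (`privatise`, `fit`, `classify`, `num_cells`, `epsilon`) are hypothetical.

```python
import random


def laplace(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace distributed with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def cell_index(x, num_cells):
    # Discretise the location of x in [0, 1] into one of `num_cells` cells.
    return min(int(x * num_cells), num_cells - 1)


def privatise(x, y, num_cells, epsilon, rng):
    # Each individual releases one noisy value per cell: the label y (+1 or -1)
    # in the cell containing x and 0 elsewhere, plus Laplace noise.
    # Assumption of this sketch: scale 2 / epsilon.
    scale = 2.0 / epsilon
    home = cell_index(x, num_cells)
    return [(y if j == home else 0.0) + laplace(scale, rng)
            for j in range(num_cells)]


def fit(reports, num_cells):
    # Partitioning rule on privatised data: sum the noisy signed
    # indicators per cell and label each cell by the sign of the sum.
    sums = [0.0] * num_cells
    for report in reports:
        for j, value in enumerate(report):
            sums[j] += value
    return [1 if s >= 0 else -1 for s in sums]


def classify(x, cell_labels):
    # Predict the label attached to the cell that contains x.
    return cell_labels[cell_index(x, len(cell_labels))]
```

The statistician only ever sees the lists returned by `privatise`, never the raw pairs. Passing noise-free indicator vectors to `fit` gives the non-private partitioning rule, which is one way to compare the two settings in a simulation.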
