Achieving differential privacy for k-nearest neighbors based outlier
detection by data partitioning

When outlier detection is applied to sensitive data, mechanisms that guarantee the privacy of the underlying data are needed. The k-nearest neighbors (k-NN) algorithm is simple and among the most effective methods for outlier detection. So far, no attempts have been made to develop a differentially private (ε-DP) approach for k-NN based outlier detection. Existing approaches often relax the notion of ε-DP and employ methods other than k-NN. We propose a method for k-NN based outlier detection that separates the procedure into two steps: fitting on reference inlier data, and then applying the outlier classifier to new data. We achieve ε-DP for both the fitting algorithm and the outlier classifier with respect to the reference data by partitioning the dataset into a uniform grid, which yields low global sensitivity. Our approach yields nearly optimal performance on real-world data with varying dimensions when compared to the non-private versions of k-NN.
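The grid idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it assumes a publicly known data range `[low, high]` in every dimension, add-or-remove-one-record neighboring datasets, and the standard Laplace mechanism on cell counts. The function names `fit_private_grid` and `knn_score`, and all parameters, are hypothetical.

```python
import numpy as np


def fit_private_grid(reference, bins_per_dim, epsilon, low, high, rng):
    """Fitting step (sketch): noisy cell counts on a uniform grid over [low, high]^d."""
    d = reference.shape[1]
    # Grid edges depend only on the public range, never on the data.
    edges = [np.linspace(low, high, bins_per_dim + 1) for _ in range(d)]
    counts, _ = np.histogramdd(reference, bins=edges)
    # Assumption: adding or removing one record changes one cell count by 1,
    # so the L1 sensitivity is 1 and Laplace noise of scale 1/epsilon suffices.
    noisy = counts + rng.laplace(scale=1.0 / epsilon, size=counts.shape)
    # Rounding and clipping are post-processing and keep the guarantee.
    noisy = np.clip(np.rint(noisy), 0, None).astype(int)
    centers = [(e[:-1] + e[1:]) / 2 for e in edges]
    return noisy, centers


def knn_score(x, noisy, centers, k):
    """Classification step (sketch): distance from x to its k-th nearest
    reference point, with every reference point replaced by its cell center."""
    grid = np.stack(np.meshgrid(*centers, indexing="ij"), axis=-1)
    grid = grid.reshape(-1, len(centers))
    weights = noisy.reshape(-1)
    dist = np.linalg.norm(grid - np.asarray(x), axis=1)
    order = np.argsort(dist)
    cumulative = np.cumsum(weights[order])
    # First cell (by distance) at which the cumulative noisy count reaches k.
    idx = np.searchsorted(cumulative, k)
    if idx >= len(order):
        return np.inf  # fewer than k (noisy) reference points in total
    return dist[order[idx]]
```

A new point would then be flagged as an outlier when `knn_score` exceeds a chosen threshold. Because the score reads only the noisy counts, it touches the reference data solely through the private fitting step; the grid resolution `bins_per_dim` trades noise per cell against how coarsely distances are approximated.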