Validation of k-Nearest Neighbor Classifiers Using Inclusion and Exclusion

9 October 2014

Abstract

This paper presents a series of PAC error bounds for $k$-nearest neighbors classifiers, with $O(n^{-\frac{r}{2r+1}})$ expected range in the difference between error bound and actual error rate, for each integer $r>0$, where $n$ is the number of in-sample examples. The best previous expected bound range was $O(n^{-\frac{2}{5}})$. The result shows that $k$-nn classifiers, in spite of their famously fractured decision boundaries, come arbitrarily close to having Gaussian-style $O(n^{-\frac{1}{2}})$ expected differences between PAC (probably approximately correct) error bounds and actual expected out-of-sample error rates.
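
The "arbitrarily close" claim can be checked directly from the exponent. The short calculation below is only arithmetic on the rate stated in the abstract, not a step taken from the paper's proofs:

$$
\frac{r}{2r+1} \;=\; \frac{1}{2} - \frac{1}{2(2r+1)} \;\longrightarrow\; \frac{1}{2} \quad \text{as } r \to \infty .
$$

For $r = 1, 2, 3, 4$ the exponent is $\frac{1}{3}, \frac{2}{5}, \frac{3}{7}, \frac{4}{9}$. It equals the previous best exponent $\frac{2}{5}$ at $r = 2$, exceeds it for every $r \ge 3$, and stays strictly below $\frac{1}{2}$ for every finite $r$.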
