Reducing Nearest Neighbor Training Sets Optimally and Exactly

4 February 2023

Abstract

In nearest-neighbor classification, a training set $P$ of points in $\mathbb{R}^d$ with given classification is used to classify every point in $\mathbb{R}^d$: every point gets the same classification as its nearest neighbor in $P$. Recently, Eppstein [SOSA'22] developed an algorithm to detect the relevant training points, that is, those points $p\in P$ such that $P$ and $P\setminus\{p\}$ induce different classifications. We investigate the problem of finding a minimum-cardinality reduced training set $P'\subseteq P$ such that $P$ and $P'$ induce the same classification. We show that the set of relevant points is such a minimum-cardinality reduced training set if $P$ is in general position. Furthermore, we show that finding a minimum-cardinality reduced training set for possibly degenerate $P$ is in P for $d=1$, and NP-complete for $d\geq 2$.
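
To make the definitions concrete, the following is a minimal illustrative sketch for $d=1$. It is not Eppstein's algorithm and not the method of this paper: it tests relevance by brute force, directly from the definition. The function names `classify`, `signature`, and `relevant_points` are hypothetical, training points are assumed to have distinct coordinates, and query points equidistant from two differently labeled training points are ignored when comparing classifications.

```python
from itertools import groupby


def classify(training, q):
    """Return the label of the training point nearest to q.

    training is a list of (coordinate, label) pairs. Ties are broken by
    list order, which is an assumption of this sketch.
    """
    return min(training, key=lambda p: abs(p[0] - q))[1]


def signature(training):
    """Describe the induced 1D classification away from tie points.

    Returns the region labels from left to right and the boundaries
    between them. A boundary lies at the midpoint between two adjacent
    training points with different labels.
    """
    pts = sorted(training)  # assumes distinct coordinates
    runs = [list(group) for _, group in groupby(pts, key=lambda p: p[1])]
    labels = [run[0][1] for run in runs]
    boundaries = [
        (left[-1][0] + right[0][0]) / 2 for left, right in zip(runs, runs[1:])
    ]
    return labels, boundaries


def relevant_points(training):
    """Brute force: p is relevant if removing it changes the signature."""
    full = signature(training)
    return [
        p
        for i, p in enumerate(training)
        if signature(training[:i] + training[i + 1:]) != full
    ]


# Hypothetical example data: labels 'a' and 'b' on the real line.
P = [(0, "a"), (1, "a"), (2, "b"), (4, "b"), (5, "a")]
relevant = relevant_points(P)
```

In this example only the point at coordinate 0 is irrelevant, since the boundary at 1.5 is already fixed by the points at 1 and 2. Here `signature(relevant)` equals `signature(P)`, so the relevant points alone induce the same classification.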
