We propose a simple approach which, given distributed computing resources, can nearly achieve the accuracy of k-NN prediction, while matching (or improving) the faster prediction time of 1-NN. The approach consists of aggregating denoised 1-NN predictors over a small number of distributed subsamples. We show, both theoretically and experimentally, that small subsample sizes suffice to attain similar performance as k-NN, without sacrificing the computational efficiency of 1-NN.
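As a rough illustration of the idea, the following is a minimal sketch and not the authors' implementation. It assumes binary labels in {0, 1}, Euclidean distance, brute-force neighbor search, "denoising" done by relabeling each subsample point with the k-NN majority vote over the full sample, and aggregation done by majority vote across subsamples. All function names and parameter values are hypothetical, and the subsamples are processed in a loop here rather than on separate machines.

```python
import numpy as np

def knn_labels(X_train, y_train, X_query, k):
    """Majority-vote k-NN labels (binary 0/1) via brute-force squared distances."""
    d = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    # Indices of the k smallest distances per query row (order within the k is arbitrary).
    idx = np.argpartition(d, k - 1, axis=1)[:, :k]
    return (y_train[idx].mean(axis=1) >= 0.5).astype(int)

def fit_subsampled_1nn(X, y, n_subsamples, subsample_size, k, rng):
    """Draw subsamples and relabel each one with k-NN votes from the full data.

    Assumption: this relabeling step stands in for the 'denoising' in the abstract.
    """
    models = []
    for _ in range(n_subsamples):
        idx = rng.choice(len(X), size=subsample_size, replace=False)
        X_sub = X[idx]
        y_sub = knn_labels(X, y, X_sub, k)  # denoised labels for the subsample
        models.append((X_sub, y_sub))
    return models

def predict(models, X_query):
    """1-NN prediction on each subsample, aggregated by majority vote."""
    votes = np.stack([knn_labels(Xs, ys, X_query, 1) for Xs, ys in models])
    return (votes.mean(axis=0) >= 0.5).astype(int)

# Toy usage: two well-separated Gaussian clusters (synthetic data, for illustration only).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-5.0, 1.0, size=(100, 2)),
               rng.normal(5.0, 1.0, size=(100, 2))])
y = np.array([0] * 100 + [1] * 100)

models = fit_subsampled_1nn(X, y, n_subsamples=5, subsample_size=20, k=5, rng=rng)
pred = predict(models, np.array([[-5.0, -5.0], [5.0, 5.0]]))
```

In this sketch each query is compared against only `subsample_size` points per subsample, which is where the 1-NN-like prediction cost would come from if the subsamples were searched in parallel. The k-NN relabeling cost is paid once, at preprocessing time.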