
Shapley-Based Data Valuation with Mutual Information: A Key to Modified K-Nearest Neighbors

Main:3 Pages
5 Figures
6 Tables
Appendix:3 Pages
Abstract

The K-Nearest Neighbors (KNN) algorithm is widely used for classification and regression; however, it has limitations, including its equal treatment of all samples. We propose Information-Modified KNN (IM-KNN), a novel approach that leverages Mutual Information (I) and Shapley values to assign weighted values to neighbors, thereby addressing the limitation of treating every sample with the same value and weight. On average, IM-KNN improves the accuracy, precision, and recall of traditional KNN by 16.80%, 17.08%, and 16.98%, respectively, across 12 benchmark datasets. Experiments on four large-scale datasets further highlight IM-KNN's robustness to noise, imbalanced data, and skewed distributions.
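To illustrate what assigning weighted values to neighbors can mean in practice, the following is a minimal sketch of a KNN vote in which each training sample carries its own weight. It is not the authors' IM-KNN procedure: the abstract does not specify how the weights are computed from Mutual Information and Shapley values, so the function name `weighted_knn_predict`, the `sample_weights` argument, and the toy numbers are all hypothetical stand-ins for those data values.

```python
import math
from collections import defaultdict


def weighted_knn_predict(train_X, train_y, sample_weights, query, k=3):
    """Classify `query` by a weighted vote among its k nearest training points.

    `sample_weights` is a hypothetical per-sample value (assumed here to be
    given); in IM-KNN such values would come from MI and Shapley values.
    """
    # Euclidean distance from the query to every training point.
    dists = [math.dist(x, query) for x in train_X]
    # Indices of the k closest training points.
    nearest = sorted(range(len(train_X)), key=lambda i: dists[i])[:k]
    # Each neighbor votes with its own weight instead of a uniform 1.
    votes = defaultdict(float)
    for i in nearest:
        votes[train_y[i]] += sample_weights[i]
    return max(votes, key=votes.get)


# Toy data: three points near the origin, one far away.
train_X = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (1.0, 1.0)]
train_y = ["a", "a", "b", "b"]
query = (0.05, 0.05)

uniform = [1.0, 1.0, 1.0, 1.0]  # traditional KNN: every sample counts equally
valued = [0.2, 0.2, 1.0, 1.0]   # hypothetical values that down-weight the "a" samples

pred_uniform = weighted_knn_predict(train_X, train_y, uniform, query, k=3)
pred_valued = weighted_knn_predict(train_X, train_y, valued, query, k=3)
print(pred_uniform, pred_valued)
```

With uniform weights the two "a" neighbors outvote the single "b" neighbor; with the hypothetical values the "b" neighbor carries more total weight, so the prediction flips. This is the kind of effect that per-sample valuation is meant to produce.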
