
Scalable unsupervised feature selection via weight stability

Main: 21 Pages
Bibliography: 5 Pages
4 Tables
Abstract

Unsupervised feature selection is critical for improving clustering performance in high-dimensional data, where irrelevant features can obscure meaningful structure. In this work, we introduce the Minkowski Weighted k-means++, a novel initialisation strategy for the Minkowski Weighted k-means. Our initialisation selects centroids probabilistically using feature relevance estimates derived from the data itself. Building on this, we propose two new feature selection algorithms: FS-MWK++, which aggregates feature weights across a range of Minkowski exponents to identify stable and informative features, and SFS-MWK++, a scalable variant based on subsampling. We support our approach with a theoretical guarantee under mild assumptions and with extensive experiments showing that our methods consistently outperform existing alternatives. Our software can be found at this https URL.
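
To make the idea of aggregating feature weights across Minkowski exponents concrete, the following is a minimal sketch and not the authors' implementation. It assumes the weight update usually associated with Minkowski Weighted k-means, in which the weight of feature v in cluster k is 1 / sum_u (D_kv / D_ku)^(1/(p-1)), with D_kv the within-cluster dispersion of feature v under exponent p (p > 1). The function names `mwk_feature_weights` and `aggregate_weights`, the use of the cluster mean as centre, the assumption that cluster labels are already given, and the plain averaging over exponents are all illustrative choices; the abstract does not specify how FS-MWK++ performs these steps.

```python
import numpy as np


def mwk_feature_weights(X, labels, p):
    """Cluster-specific feature weights in the style of Minkowski Weighted k-means.

    Assumes p > 1 and that `labels` already holds a cluster assignment.
    """
    clusters = np.unique(labels)
    weights = np.empty((clusters.size, X.shape[1]))
    for row, k in enumerate(clusters):
        members = X[labels == k]
        # Simplification: use the mean as centre instead of the Minkowski centre.
        centre = members.mean(axis=0)
        # Per-feature dispersion D_kv; the small constant avoids division by zero.
        dispersion = (np.abs(members - centre) ** p).sum(axis=0) + 1e-12
        # ratios[v, u] = (D_kv / D_ku) ** (1 / (p - 1))
        ratios = (dispersion[:, None] / dispersion[None, :]) ** (1.0 / (p - 1.0))
        # Each row of weights sums to 1 over the features.
        weights[row] = 1.0 / ratios.sum(axis=1)
    return weights


def aggregate_weights(X, labels, exponents):
    """Average the cluster-mean feature weights over several exponents (assumed rule)."""
    per_exponent = [mwk_feature_weights(X, labels, p).mean(axis=0) for p in exponents]
    return np.mean(per_exponent, axis=0)


# Toy data: feature 0 separates two clusters tightly, features 1 and 2 are noise.
rng = np.random.default_rng(0)
toy_labels = np.repeat([0, 1], 50)
toy_X = np.column_stack([
    toy_labels * 10.0 + rng.normal(0.0, 0.1, 100),
    rng.normal(0.0, 5.0, 100),
    rng.normal(0.0, 5.0, 100),
])
toy_scores = aggregate_weights(toy_X, toy_labels, [1.5, 2.0, 3.0])
print(toy_scores)
```

In this sketch, a feature with low within-cluster dispersion under every exponent keeps a high averaged weight, which is one simple way to read "stable and informative"; ranking features by `toy_scores` and keeping the top ones would be a hypothetical selection step.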
