
Supervised Fuzzy Partitioning

Abstract

Centroid-based methods, including k-means and fuzzy c-means, are known as effective and easy-to-implement approaches to clustering in many application areas. However, these algorithms cannot be applied directly to supervised tasks. We propose a generative model that extends the centroid-based clustering approach to classification tasks. Given an arbitrary loss function, our approach, termed Supervised Fuzzy Partitioning (SFP), incorporates label information into its objective function through a surrogate term penalizing the empirical risk. We also fuzzify the partition and assign weights to features, alongside entropy-based regularization terms, enabling the method to capture more complex patterns, to identify significant features, and to perform better on high-dimensional data. We formulate an iterative algorithm based on a block coordinate descent scheme to efficiently find a local optimizer. Extensive classification experiments on synthetic, real-world, and high-dimensional datasets demonstrate that the performance of SFP is competitive with state-of-the-art algorithms such as random forest and SVM. Our method has a major advantage over such methods: it not only yields a flexible nonlinear model but can also exploit any loss function in the training phase without compromising computational efficiency.
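The abstract does not state the SFP objective, so the following is only a minimal sketch of the unsupervised building block it alludes to: a fuzzy partition with an entropy regularizer, optimized by alternating (block coordinate) updates. It assumes squared Euclidean distances and a regularization strength `gamma`, and it omits the supervised loss term and the feature weights entirely. All function and parameter names are hypothetical and are not taken from the paper.

```python
import numpy as np

def entropy_fuzzy_memberships(X, centers, gamma):
    """Membership block: minimizes sum_ij u_ij * d_ij + gamma * sum_ij u_ij * log(u_ij)
    subject to each row of u summing to 1, which gives u_ij proportional to exp(-d_ij / gamma)."""
    # Squared Euclidean distances, shape (n_samples, n_clusters).
    d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    # Shift by the row minimum for numerical stability (does not change the result).
    u = np.exp(-(d - d.min(axis=1, keepdims=True)) / gamma)
    return u / u.sum(axis=1, keepdims=True)

def update_centers(X, u):
    """Centroid block: membership-weighted means, shape (n_clusters, n_features)."""
    return (u.T @ X) / u.sum(axis=0)[:, None]

def fit(X, k, gamma=1.0, n_iter=50, seed=0):
    """Alternate the two closed-form block updates for a fixed number of iterations."""
    rng = np.random.default_rng(seed)
    # Assumption: initialize centers from k distinct data points.
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        u = entropy_fuzzy_memberships(X, centers, gamma)
        centers = update_centers(X, u)
    return u, centers
```

Smaller values of `gamma` push the memberships toward a hard, k-means-like assignment, while larger values spread each point more evenly across clusters. A supervised variant in the spirit of the abstract would add a loss-dependent term to the quantity inside the exponent, but its exact form is not given here.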
