Fractionally-Supervised Classification
Traditionally, there are three species of classification: unsupervised, supervised, and semi-supervised. Supervised and semi-supervised classification differ in whether weight is given to unlabelled observations in the classification procedure. In unsupervised classification, or clustering, either there are no labelled observations or the labels are ignored altogether. A priori, it can be very difficult to choose the optimal level of supervision, and the consequences of a sub-optimal choice can be severe. A flexible fractionally-supervised approach to classification is introduced, in which any level of supervision, ranging from unsupervised to supervised, can be attained. Our approach uses a weighted likelihood, wherein the weights control the level of supervision. Gaussian mixture models are used as a vehicle to illustrate our fractionally-supervised classification approach; however, it is broadly applicable, and variations on the postulated model can easily be made by adjusting the weights. A comparison between our approach and the traditional species is presented using benchmark model-based clustering data.
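The weighted-likelihood idea can be illustrated with a minimal sketch. The abstract does not give the exact form of the weights, so the code assumes a single hypothetical weight `alpha` in [0, 1] that multiplies the labelled log-likelihood, with `1 - alpha` multiplying the unlabelled (mixture) log-likelihood. The function names, the univariate Gaussian components, and the parameter layout are likewise illustrative assumptions rather than the authors' implementation.

```python
import math


def normal_logpdf(x, mu, sigma):
    """Log-density of a univariate Gaussian with mean mu and std. dev. sigma."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - 0.5 * ((x - mu) / sigma) ** 2


def weighted_log_likelihood(x_lab, y_lab, x_unl, pis, mus, sigmas, alpha):
    """Weighted log-likelihood of a univariate Gaussian mixture.

    ASSUMPTION: the weighting alpha * labelled + (1 - alpha) * unlabelled is
    one plausible reading of "weights control the level of supervision";
    it is not taken from the abstract.
    """
    # Labelled term: each observation contributes through its known component.
    lab = sum(
        math.log(pis[g]) + normal_logpdf(x, mus[g], sigmas[g])
        for x, g in zip(x_lab, y_lab)
    )

    # Unlabelled term: log of the full mixture density (log-sum-exp for stability).
    unl = 0.0
    for x in x_unl:
        terms = [
            math.log(p) + normal_logpdf(x, m, s)
            for p, m, s in zip(pis, mus, sigmas)
        ]
        top = max(terms)
        unl += top + math.log(sum(math.exp(t - top) for t in terms))

    return alpha * lab + (1.0 - alpha) * unl


# Example: two components, two labelled and two unlabelled observations.
params = dict(pis=[0.5, 0.5], mus=[0.0, 4.0], sigmas=[1.0, 1.0])
for alpha in (0.0, 0.5, 1.0):
    value = weighted_log_likelihood([0.0, 4.0], [0, 1], [0.5, 3.5], alpha=alpha, **params)
    print(alpha, value)
```

Under this assumed weighting, `alpha = 1` uses only the labelled observations (a supervised-style objective), `alpha = 0` uses only the unlabelled term (a clustering-style objective), and `alpha = 0.5` is proportional to the usual semi-supervised log-likelihood. In practice such an objective would be maximised over the mixture parameters, for example with an EM-type algorithm.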