Fractionally-Supervised Classification
Traditionally, there are three species of classification: unsupervised, supervised, and semi-supervised. Supervised and semi-supervised classification differ in whether weight is given to unlabelled observations in the classification procedure. In unsupervised classification, or clustering, no labels are known and hence full weight is given to unlabelled observations. A priori, it can be very difficult to choose the optimal level of supervision, and the consequences of a sub-optimal choice can be non-trivial. A flexible fractionally-supervised approach to classification is introduced, in which any level of supervision, ranging from unsupervised to supervised, can be attained. Our approach uses a weighted likelihood, wherein the weights control the level of supervision. This paper investigates several choices for the specification of these weights. Gaussian mixture models are used as a vehicle to illustrate our fractionally-supervised classification approach; however, the approach is broadly applicable, and variations on the postulated model can easily be made. A comparison between our approach and the traditional species is presented using simulated and real data.
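To make the idea of a weighted likelihood concrete, the following is a minimal sketch for a univariate Gaussian mixture. It is an illustration only and not necessarily the formulation used in the paper: the function names, the two separate weights `w_labelled` and `w_unlabelled`, and the toy data are all assumptions introduced here. Labelled observations contribute the log-density of their known component, unlabelled observations contribute the log of the full mixture density, and the two sums are combined with the weights.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def weighted_log_likelihood(labelled, unlabelled, props, means, sds,
                            w_labelled, w_unlabelled):
    """Weighted log-likelihood of a univariate Gaussian mixture.

    Hypothetical formulation (an assumption, not taken from the paper):
    labelled   -- list of (x, g) pairs, g being the known component index
    unlabelled -- list of x values with unknown component
    props, means, sds -- mixing proportions, means, standard deviations
    w_labelled, w_unlabelled -- weights on the two likelihood parts
    """
    # Labelled part: each point uses only its own component.
    ll_lab = sum(
        math.log(props[g] * normal_pdf(x, means[g], sds[g]))
        for x, g in labelled
    )
    # Unlabelled part: each point uses the full mixture density.
    ll_unl = sum(
        math.log(sum(p * normal_pdf(x, m, s)
                     for p, m, s in zip(props, means, sds)))
        for x in unlabelled
    )
    return w_labelled * ll_lab + w_unlabelled * ll_unl

# Toy data and parameters, invented purely for illustration.
labelled = [(-1.2, 0), (-0.8, 0), (2.1, 1)]
unlabelled = [-1.0, 0.3, 1.9, 2.4]
props, means, sds = [0.5, 0.5], [-1.0, 2.0], [1.0, 1.0]

# Only labelled data carry weight (supervised-style objective).
supervised_like = weighted_log_likelihood(
    labelled, unlabelled, props, means, sds, 1.0, 0.0)
# Only unlabelled data carry weight (clustering-style objective).
unsupervised_like = weighted_log_likelihood(
    labelled, unlabelled, props, means, sds, 0.0, 1.0)
# Equal weights: both parts contribute equally.
balanced = weighted_log_likelihood(
    labelled, unlabelled, props, means, sds, 0.5, 0.5)
```

Under this assumed form, shifting weight from the labelled sum to the unlabelled sum moves the objective from a supervised-style one toward a clustering-style one, with intermediate weights giving intermediate levels of supervision. How the weights are best specified is the question the abstract says the paper investigates; this sketch takes no position on it.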