Drawing Parallels between Multi-Label Classification and Multi-Target Regression

Although multi-label classification can be seen as a special case of multi-target regression, recent advances in multi-label classification motivate a study of whether newer state-of-the-art algorithms developed for that task are applicable and equally successful in the domain of multi-target regression. In this paper we introduce two new algorithms for multi-target regression: multi-target stacking (MTS) and ensemble of regressor chains (ERC). Our methods, inspired by two popular multi-label classification approaches, are based on a single-target decomposition of the multi-target problem and the idea of treating the other prediction targets as additional input variables that augment the input space. We identify two important shortcomings in both methods, which are also relevant for their classification counterparts, and develop extensions to tackle them. The first extension concerns the methodology used to create the additional input variables, where we find that internal cross-validation is the best approach. The second extension is the addition of an explicit feature selection step, which aims to remove irrelevant and redundant meta variables from the input space. All methods and their extensions are empirically evaluated on 12 real-world multi-target regression data sets, 8 of which are first introduced in this paper and are made publicly available for future benchmarks. The results of an extensive comparison against the baseline (single-target) approach and high-performing methods from the literature show that the proposed techniques advance the state of the art in multi-target regression in terms of predictive accuracy. In particular, ERC equipped with the proposed extensions achieves the best overall accuracy.
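To make the core idea concrete, the sketch below (not the authors' implementation) shows a single regressor chain in which each target's prediction is appended to the input space of the subsequent single-target models; ERC would average several such chains over random target orders, and the paper's extensions would replace the in-sample meta predictions with internally cross-validated ones and add feature selection. The base learner (scikit-learn's RandomForestRegressor) and the fixed chain order are illustrative assumptions.

```python
# Minimal illustration of the "targets as extra inputs" idea behind MTS/ERC.
# This is a simplified sketch, not the paper's algorithm: it uses one chain,
# in-sample meta predictions, and no feature selection.
import numpy as np
from sklearn.ensemble import RandomForestRegressor


class SimpleRegressorChain:
    """One regressor per target; each later model sees X plus the
    predictions of all previously fitted targets."""

    def __init__(self, base=RandomForestRegressor, order=None, **base_kwargs):
        self.base = base
        self.order = order
        self.base_kwargs = base_kwargs
        self.models_ = []

    def fit(self, X, Y):
        n_targets = Y.shape[1]
        self.order_ = self.order if self.order is not None else list(range(n_targets))
        X_aug = np.asarray(X)
        for t in self.order_:
            model = self.base(**self.base_kwargs).fit(X_aug, Y[:, t])
            self.models_.append(model)
            # NOTE: in-sample predictions are used here for simplicity; the
            # paper's first extension generates these meta variables with
            # internal cross-validation to reduce the train/test mismatch.
            preds = model.predict(X_aug).reshape(-1, 1)
            X_aug = np.hstack([X_aug, preds])
        return self

    def predict(self, X):
        X_aug = np.asarray(X)
        outputs = np.empty((X_aug.shape[0], len(self.order_)))
        for model, t in zip(self.models_, self.order_):
            preds = model.predict(X_aug)
            outputs[:, t] = preds
            X_aug = np.hstack([X_aug, preds.reshape(-1, 1)])
        return outputs
```

An ERC-style ensemble would train several SimpleRegressorChain instances with shuffled target orders and average their predictions; MTS instead trains a first round of independent single-target models and feeds all of their (cross-validated) outputs into a second round of models.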