A hybrid ensemble method with negative correlation learning for
regression
Hybrid ensembles, an essential branch of ensemble learning, have flourished in numerous machine learning problems, especially regression. Several studies have confirmed the importance of diversity; however, previous ensembles consider diversity only in the sub-model training stage, yielding limited improvement over single models. In contrast, this study selects and weights sub-models from a heterogeneous model pool automatically by solving an optimization problem with an interior-point filter line-search algorithm. This optimization problem innovatively incorporates negative correlation learning (NCL) as a penalty term, with which a diverse model subset can be selected. Experimental results reveal several meaningful findings. The model pool is constructed from different classes of models, with the candidate parameter settings of each class serving as sub-models. The best sub-models from each class are selected to construct an NCL-based ensemble, which performs far better than the average of the sub-models. Furthermore, compared with classical constant and non-constant weighting methods, the NCL-based ensemble has a significant advantage on several prediction metrics. In practice, it is difficult to determine the optimal sub-model for a dataset a priori due to model uncertainty; however, our method achieves accuracy comparable to that of the potentially optimal sub-models on the RMSE metric. In conclusion, the value of this study lies in its ease of use and effectiveness, allowing the hybrid ensemble to embrace both diversity and accuracy.
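The weighting idea described above can be sketched as a small constrained optimization: minimize the ensemble's mean squared error minus an NCL-style diversity bonus, with the sub-model weights restricted to the simplex. This is an illustrative assumption-laden sketch, not the paper's implementation: the paper uses an interior-point filter line-search algorithm, whereas here SciPy's SLSQP solver stands in, and the penalty coefficient `lam` and the exact penalty form are assumptions.

```python
import numpy as np
from scipy.optimize import minimize


def ncl_weights(preds, y, lam=0.1):
    """Sketch of NCL-penalized ensemble weighting (hypothetical helper).

    preds : (n_models, n_samples) array of sub-model predictions
    y     : (n_samples,) array of targets
    lam   : assumed diversity-penalty coefficient
    Returns simplex weights for the sub-models.
    """
    n_models = preds.shape[0]

    def objective(w):
        f = w @ preds                          # weighted ensemble prediction
        mse = np.mean((f - y) ** 2)            # ensemble accuracy term
        # NCL diversity term: reward sub-models that deviate from the
        # ensemble, encouraging a negatively correlated (diverse) subset.
        diversity = np.mean(w @ (preds - f) ** 2)
        return mse - lam * diversity

    result = minimize(
        objective,
        x0=np.full(n_models, 1.0 / n_models),  # start from uniform weights
        bounds=[(0.0, 1.0)] * n_models,
        constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}],
        method="SLSQP",                        # stand-in for the paper's solver
    )
    return result.x
```

Sub-models receiving near-zero weight are effectively deselected, which is how the single optimization performs both selection and weighting.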