PFGE: Parsimonious Fast Geometric Ensembling of DNNs

International Conference on Intelligent Computing (ICIC), 2022
Abstract

Ensemble methods have been widely used to improve the generalization performance of machine learning methods, but they are difficult to apply in deep learning, because training an ensemble of deep neural networks (DNNs) and then employing them for inference incurs extremely high costs for model training and test-time computation. Recently, several advanced techniques, such as fast geometric ensembling (FGE) and snapshot ensemble (SNE), have been proposed. These methods can train the model ensembles in the same time as a single model, thus circumventing the hurdle of training time. However, their costs for model recording and test-time computation remain much higher than those of their single-model counterparts. Here we propose a parsimonious FGE (PFGE) algorithm that employs a lightweight ensemble of higher-performing DNNs generated by a series of successively performed stochastic weight averaging (SWA) procedures. Experimental results across different advanced DNN architectures on different datasets, namely CIFAR-{10,100} and ImageNet, demonstrate its performance. Results show that, compared with state-of-the-art methods, PFGE achieves comparable or even better performance in terms of generalization and calibration, at a much-reduced cost for model recording and test-time computation.
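The core idea described above, running several SWA procedures in succession and keeping only the few averaged models as the ensemble, can be sketched as follows. This is a hedged illustration, not the authors' implementation; the function names and the segmenting scheme are assumptions for clarity.

```python
def swa_average(weight_snapshots):
    """Average a list of weight vectors (plain lists of floats),
    as a single SWA procedure does over its collected snapshots."""
    n = len(weight_snapshots)
    dim = len(weight_snapshots[0])
    return [sum(w[i] for w in weight_snapshots) / n for i in range(dim)]

def pfge_style_ensemble(trajectory, snapshots_per_swa):
    """Illustrative PFGE-style scheme: split a trajectory of weight
    snapshots into successive SWA segments; each segment yields one
    averaged model, so only a few models are recorded for the ensemble."""
    models = []
    for start in range(0, len(trajectory), snapshots_per_swa):
        segment = trajectory[start:start + snapshots_per_swa]
        models.append(swa_average(segment))
    return models

# Toy example: six one-dimensional "weight" snapshots, two successive
# SWA procedures of three snapshots each -> two averaged models kept,
# instead of six raw snapshots.
trajectory = [[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]]
ensemble = pfge_style_ensemble(trajectory, 3)
# → [[2.0], [11.0]]
```

At test time, predictions from these few averaged models would be combined (for example, by averaging their output probabilities), which is what keeps PFGE's test-time cost low relative to ensembling many raw snapshots.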
