An Efficient Intrusion Detection System Based on Feature Selection and Ensemble Classifier

Abstract

Intrusion detection systems (IDSs) are widely deployed in networks to safeguard the integrity and availability of sensitive assets in the protected systems. Although many supervised and unsupervised machine learning approaches have been used to increase the efficacy of IDSs, handling the large amount of redundant and irrelevant data in high-dimensional datasets remains a problem. In this paper, we introduce a novel machine learning based IDS that increases the accuracy and efficiency of classification. To cope with the high dimensionality of network traffic, we propose a heuristic feature selection algorithm called Correlation-based-Feature-Selection-Bat-Algorithm (CFS-BA). CFS-BA evaluates the correlation between features and selects the optimal subset for the training and testing processes. We also introduce an ensemble approach that combines the decisions of C4.5, Random Forest (RF) and Forest by Penalizing Attributes (Forest PA) using the average of probabilities (AOP) rule, which helps to deal with unbalanced datasets and multi-class classification. Experimental results on the KDDCup'99, NSL-KDD and CIC-IDS2017 datasets show that our CFS-BA-Ensemble method outperforms other related techniques under several metrics. Specifically, on CIC-IDS2017 the proposed IDS reduces the training time from 113.53 to 44.78 and the testing time from 2.93 to 2.06. Moreover, compared with other feature selection methods, our proposal achieves the highest F-Measure of 0.998 and the lowest False Alarm Rate (FAR) of 0.17% on KDDCup'99.
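
To make the average of probabilities (AOP) rule concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes each base classifier returns one probability per class in the same class order. The function names, class labels and probability values are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def average_of_probabilities(
    per_model_probs: Sequence[Sequence[float]],
) -> List[float]:
    """Fuse classifier outputs by averaging their class-probability vectors."""
    if not per_model_probs:
        raise ValueError("need at least one classifier output")
    n_classes = len(per_model_probs[0])
    if any(len(p) != n_classes for p in per_model_probs):
        raise ValueError("all classifiers must score the same classes")
    n_models = len(per_model_probs)
    # For each class, take the mean of the probabilities assigned by the models.
    return [
        sum(p[c] for p in per_model_probs) / n_models
        for c in range(n_classes)
    ]


def predict(per_model_probs: Sequence[Sequence[float]],
            labels: Sequence[str]) -> str:
    """Return the label whose averaged probability is largest."""
    fused = average_of_probabilities(per_model_probs)
    best = max(range(len(fused)), key=fused.__getitem__)
    return labels[best]


# Hypothetical per-class probabilities for one network record (assumed values).
labels = ["normal", "dos", "probe"]
c45_probs = [0.90, 0.10, 0.00]        # stand-in for a C4.5 output
rf_probs = [0.40, 0.60, 0.00]         # stand-in for a Random Forest output
forest_pa_probs = [0.45, 0.55, 0.00]  # stand-in for a Forest PA output

outputs = [c45_probs, rf_probs, forest_pa_probs]
fused = average_of_probabilities(outputs)
predicted = predict(outputs, labels)
print(fused, predicted)
```

In this made-up example two of the three classifiers individually favor "dos", yet the averaged probabilities favor "normal" because one classifier is far more confident. This is how averaging probabilities can differ from simple majority voting.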
