Evidential classification and feature selection for cyber-threat hunting

Journal

KNOWLEDGE-BASED SYSTEMS
Volume 226

Publisher

ELSEVIER
DOI: 10.1016/j.knosys.2021.107120

Keywords

Feature selection; Dempster-Shafer Theory; Evidence theory; Evidential classification; Logistic regression

The study introduces a novel evidence-based feature selection method for network security, allowing security analysts to rank features in terms of uncertainty levels without expert knowledge. This approach enables fast and accurate detection and differentiation of cyber threats, outperforming or at least matching state-of-the-art techniques.
In recent years, there has been immense research interest in applying Machine Learning to defend networked systems against cyber threats. A particular challenge in this domain is the identification and selection of appropriate features that ensure prompt and correct cyber threat detection. This work proposes a novel approach that leverages recent advances in evidence theory to provide deep insight into the effect of each feature's uncertainty on the overall classification decision. As a result, a network security analyst may rank the features in a dataset from the most to the least ambiguous, without requiring expert domain knowledge in cyber threats. Ultimately, this enables the creation of cyber threat phenotypes, which may be used to detect and differentiate between similarly manifested cyber threats. The proposed approach is evaluated on a recent, challenging scenario of network security attacks and compared against multiple feature selection techniques. Based on the selected features, cyber threat classification analysis is performed using seven state-of-the-art ML classification algorithms. The results indicate that the proposed evidence-based feature selection method performs better than, or at least as well as, the state of the art. Against the best-performing state-of-the-art technique, Decision Tree, the proposed technique's features enabled the classification process to complete in 93.25% of the time, whilst maintaining a high F1 Score of 0.99. Furthermore, the proposed technique's features enable a faster classification process requiring, on average, just 29.25% of the time compared to the average across the other evaluated techniques. (C) 2021 Elsevier B.V. All rights reserved.
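
To make the idea of ranking features "from the most to the least ambiguous" more concrete, the following is a minimal sketch in the spirit of Dempster-Shafer theory. It is not the paper's algorithm: the abstract does not give the method's details. The feature names, the mass values, and the choice to measure ambiguity as the mass assigned to the whole frame {benign, attack} are all assumptions made purely for illustration. The `combine` function implements the standard Dempster's rule of combination.

```python
from itertools import product

# Frame of discernment for a two-class problem (hypothetical labels).
FRAME = frozenset({"benign", "attack"})
BENIGN = frozenset({"benign"})
ATTACK = frozenset({"attack"})


def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Each mass function maps a frozenset (a subset of the frame) to its mass.
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            # Mass falling on the empty set is the conflict between sources.
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("total conflict: sources cannot be combined")
    # Renormalise so the remaining masses sum to 1.
    return {s: w / (1.0 - conflict) for s, w in combined.items()}


def rank_by_ignorance(masses):
    """Order feature names from most to least ambiguous.

    Assumption: ambiguity is taken to be the mass on the whole frame,
    i.e. the evidence that cannot distinguish between the classes.
    """
    return sorted(masses, key=lambda name: masses[name].get(FRAME, 0.0), reverse=True)


# Hypothetical per-feature evidence; the values are invented for illustration.
feature_masses = {
    "feature_a": {ATTACK: 0.7, BENIGN: 0.1, FRAME: 0.2},
    "feature_b": {ATTACK: 0.2, BENIGN: 0.2, FRAME: 0.6},
    "feature_c": {ATTACK: 0.5, BENIGN: 0.1, FRAME: 0.4},
}

ranking = rank_by_ignorance(feature_masses)
fused = combine(feature_masses["feature_a"], feature_masses["feature_c"])

print(ranking)
print(fused)
```

In this toy setting, an analyst could drop the features at the top of `ranking` (those whose evidence mostly fails to separate the classes) and fuse the evidence of the remaining ones with `combine`. Whether this matches the selection criterion used in the paper is not something the abstract confirms.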
