Article

Evidential classification and feature selection for cyber-threat hunting

Journal

KNOWLEDGE-BASED SYSTEMS
Volume 226, Issue -, Pages -

Publisher

ELSEVIER
DOI: 10.1016/j.knosys.2021.107120

Keywords

Feature selection; Dempster-Shafer Theory; Evidence theory; Evidential classification; Logistic regression

The study introduces a novel evidence-based feature selection method for network security, allowing security analysts to rank features in terms of uncertainty levels without expert knowledge. This approach enables fast and accurate detection and differentiation of cyber threats, outperforming or at least matching state-of-the-art techniques.
In recent years, there has been immense research interest in applying Machine Learning (ML) to defend networked systems against cyber threats. A particular challenge in this domain is identifying and selecting appropriate features that ensure prompt and correct cyber threat detection. This work proposes a novel approach that leverages recent advances in evidence theory to provide deep insight into the effect of each feature's uncertainty on the overall classification decision. As a result, a network security analyst may rank the features in a dataset from the most to the least ambiguous, without requiring expert domain knowledge of cyber threats. Ultimately, this enables the creation of cyber threat phenotypes, which may be used to detect and differentiate between similarly manifested cyber threats. The proposed approach is evaluated on a recent, challenging scenario of network security attacks and compared against multiple feature selection techniques. Based on the selected features, cyber threat classification analysis is performed using seven state-of-the-art ML classification algorithms. The results indicate that the proposed evidence-based feature selection method performs better than, or at least as well as, the state of the art. Against the best-performing state-of-the-art technique, Decision Tree, the proposed technique's features enabled the classification process to take place in 93.25% of the time, whilst maintaining a high F1 score of 0.99. Furthermore, the proposed technique's features enable a faster classification process, requiring on average just 29.25% of the time averaged across the other evaluated techniques. © 2021 Elsevier B.V. All rights reserved.
