4.7 Article

Optimization scheme for intrusion detection scheme GBDT in edge computing center

Journal

COMPUTER COMMUNICATIONS
Volume 168, Pages 136-145

Publisher

ELSEVIER
DOI: 10.1016/j.comcom.2020.12.007

Keywords

Intrusion detection; Gradient boosting decision tree; Machine learning; Ensemble learning

Funding

  1. National Natural Science Foundation of China (NSFC) [61872205]
  2. Source Innovation Program of Qingdao [18-2-2-56-jch]
  3. Shandong Provincial Natural Science Foundation [ZR2019MF018]

The paper proposes an optimization scheme for GBDT that improves detection precision and training efficiency by addressing imbalanced data, high-dimensional data features, and slow parameter optimization. The scheme uses MSMOTE to correct data imbalance, RFE-HCV to reduce feature dimensionality, and the FGS algorithm to speed up parameter optimization. The experimental results show that the new scheme balances the data set, eliminates redundant features, and significantly improves parameter optimization efficiency.
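The oversampling idea mentioned in the summary can be illustrated with a minimal sketch. This is a generic SMOTE-style interpolation with a simple noise filter, not the paper's MSMOTE, whose details are not given here. The function names, the noise rule (a minority sample counts as noise when none of its `k` nearest neighbours shares its label), and the toy data are all assumptions made for illustration.

```python
import math
import random


def nearest(point, candidates, k):
    """Return the k (features, label) pairs closest to point (Euclidean)."""
    return sorted(candidates, key=lambda c: math.dist(point, c[0]))[:k]


def oversample_minority(samples, minority_label, n_new, k=3, seed=0):
    """Create n_new synthetic minority feature vectors by interpolation.

    samples is a list of (features, label) pairs. Minority samples whose
    k nearest neighbours all carry another label are treated as noise
    (an assumed rule) and are never used as interpolation endpoints.
    """
    rng = random.Random(seed)
    minority = [s for s in samples if s[1] == minority_label]

    usable = []
    for s in minority:
        others = [t for t in samples if t is not s]
        neighbours = nearest(s[0], others, k)
        if any(label == minority_label for _, label in neighbours):
            usable.append(s)
    if not usable:
        return []

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(usable)
        mates = [t for t in usable if t is not base]
        if not mates:
            synthetic.append(tuple(base[0]))
            continue
        neighbour = rng.choice(nearest(base[0], mates, k))
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base[0], neighbour[0]))
        )
    return synthetic


# Hypothetical toy data: six "normal" points, three clustered "attack"
# points, and one isolated "attack" point that the filter treats as noise.
data = [
    ((0.0, 0.0), "normal"), ((0.0, 1.0), "normal"), ((1.0, 0.0), "normal"),
    ((1.0, 1.0), "normal"), ((0.5, 0.5), "normal"), ((0.2, 0.8), "normal"),
    ((5.0, 5.0), "attack"), ((5.0, 6.0), "attack"), ((6.0, 5.0), "attack"),
    ((0.4, 0.4), "attack"),
]
new_points = oversample_minority(data, "attack", n_new=4)
```

Every synthetic point lies on a segment between two non-noise minority samples, so the isolated point at (0.4, 0.4) never pulls new samples into the majority region.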
The combination of edge computing technologies and machine learning helps to put edge intelligence into practice, and the Industrial Internet of Things (IIoT) is one of its most typical applications. However, such a system can easily be attacked while the edge computing center processes localized perception data. Intrusion detection technologies based on machine learning provide strong security for the edge computing center, and the most widely used of them is the gradient boosting decision tree (GBDT). This model still faces problems such as imbalanced data, high-dimensional data features, and inefficient parameter optimization. To solve these problems, this paper proposes an optimization scheme for GBDT that improves its detection precision and training efficiency. First, to address imbalanced data in the data set, we propose a margin synthetic minority oversampling technique (MSMOTE), which expands the non-noise data of classes with few samples (small samples) so that the data is evenly distributed. Second, to lower the feature dimensionality, we propose a recursive feature elimination-hierarchy cross validation algorithm (RFE-HCV). The new algorithm recursively eliminates redundant features according to feature weight, which strengthens the relationship between the features and the target. It also uses a hierarchy system to ensure that each data category (attack category) appears in the same proportion in the training set and the testing set at the cross-validation stage. Next, to make parameter optimization more efficient during model training, we develop a flexible grid search algorithm (FGS) that improves the efficiency of retrieving the optimum parameters. Finally, detailed experimental results show that the new scheme balances the data set, eliminates redundant features, and increases the efficiency of parameter optimization threefold. Moreover, the new scheme defends against intrusion more effectively.
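The requirement that each attack category keeps the same proportion in the training and testing sets during cross validation can be sketched as a stratified fold assignment. The code below is a plain stratified k-fold split, not the paper's RFE-HCV procedure; the function names and the labels `normal`, `dos`, and `probe` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def stratified_folds(labels, n_folds):
    """Assign sample indices to folds so each class is spread evenly."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        # Deal the members of one class to the folds in round-robin order.
        for position, index in enumerate(indices):
            folds[position % n_folds].append(index)
    return folds


def train_test_splits(labels, n_folds):
    """Yield (train_indices, test_indices), holding out one fold at a time."""
    folds = stratified_folds(labels, n_folds)
    for held_out in range(n_folds):
        test = folds[held_out]
        train = [
            index
            for fold_number, fold in enumerate(folds)
            if fold_number != held_out
            for index in fold
        ]
        yield train, test


# Hypothetical label vector: 8 normal records and 4 of each attack type.
labels = ["normal"] * 8 + ["dos"] * 4 + ["probe"] * 4
splits = list(train_test_splits(labels, n_folds=4))
```

With this toy label vector every held-out fold contains two `normal`, one `dos`, and one `probe` record, matching the 2:1:1 class ratio of the full set. A feature elimination loop could then score each candidate feature subset on these splits.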
