Journal
FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE
Volume 141, Pages 40-53
Publisher
ELSEVIER
DOI: 10.1016/j.future.2022.10.024
Keywords
Active learning; IoT botnet; Botnet detection; Machine learning; Query learning; Internet of things; Intrusion detection; IoT
The active learning approach for machine learning is especially useful in environments with abundant unlabeled data and high labeling costs. This study focuses on the application of different active learning approaches in the context of security operations centers (SOCs) and IoT botnet detection. Through thorough benchmarking, the study evaluates the effectiveness of uncertainty sampling, ranked batch-mode sampling, and query by committee strategies for selecting the best query instance for learning. The results demonstrate that the active learning approach can significantly improve detection models while using less data than passive approaches. The impact of wrong-labeled data is also explored.
The active learning approach for machine learning can greatly benefit those environments where a wealth of unlabeled data is available, and the labeling cost of the data can be restrictive. In this regard, security operations centers (SOCs) can take advantage of the human expertise available to improve machine learning-based detection models using the active learning approach. In the context of SOC operations and IoT botnet detection, our study provides a thorough benchmarking of the application of different active learning approaches within the framework of pool-based sampling. The selection of the optimal query instance for learning is evaluated using uncertainty sampling, ranked batch-mode sampling, and query by committee strategies. Our results show that the active learning approach can help to generate better detection models using all the active learning query strategies tested in our benchmarking setup. Leveraging the human-machine interaction can produce high-performance models in the context of IoT botnet detection using significantly less data than the passive approaches traditionally used for the generation of machine learning-based detection systems. Additionally, the impact of wrong-labeled data in the active learning implementation is explored. (c) 2022 Elsevier B.V. All rights reserved.
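To make the query-selection step concrete, the following is a minimal sketch of two of the strategies named in the abstract, applied to a small pool of unlabeled instances. It is not the authors' implementation. The abstract does not say which uncertainty or disagreement measures were used, so least-confidence scoring and vote entropy are assumptions here, and every function name, label, and number is hypothetical.

```python
import math
from collections import Counter


def least_confident_index(probabilities):
    """Uncertainty sampling (least-confidence variant, an assumption).

    probabilities[i] holds the model's class probabilities for pool instance i.
    Returns the index of the instance whose top class probability is lowest.
    """
    uncertainties = [1.0 - max(p) for p in probabilities]
    return max(range(len(uncertainties)), key=uncertainties.__getitem__)


def vote_entropy_index(committee_votes):
    """Query by committee (vote-entropy variant, an assumption).

    committee_votes[i] holds the label each committee member predicts for
    pool instance i. Returns the index where the members disagree most.
    """
    def vote_entropy(votes):
        total = len(votes)
        return -sum((count / total) * math.log(count / total)
                    for count in Counter(votes).values())

    entropies = [vote_entropy(votes) for votes in committee_votes]
    return max(range(len(entropies)), key=entropies.__getitem__)


# Hypothetical pool of three unlabeled flows, for illustration only.
pool_probabilities = [[0.95, 0.05], [0.55, 0.45], [0.80, 0.20]]
pool_votes = [
    ["benign", "benign", "benign"],
    ["botnet", "benign", "botnet"],
    ["botnet", "botnet", "botnet"],
]

# Index of the instance that would be sent to the analyst for labeling.
query_by_uncertainty = least_confident_index(pool_probabilities)
query_by_committee = vote_entropy_index(pool_votes)
```

In a pool-based loop, the selected instance would be labeled by a SOC analyst, moved from the pool into the training set, and the model retrained before the next query. Ranked batch-mode sampling, which selects several instances per round, is not covered by this sketch.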