4.7 Article

Semi-supervised anomaly detection algorithms: A comparative summary and future research directions

Journal

KNOWLEDGE-BASED SYSTEMS
Volume 218

Publisher

ELSEVIER
DOI: 10.1016/j.knosys.2021.106878

Keywords

Review; Semi-supervised classification; Anomaly detection; Up-to-date comparison; Meta-analysis study

While anomaly detection is a well-studied field, this paper focuses on empirically studying the performance of 29 semi-supervised anomaly detection algorithms on benchmark databases. Results show that BRM is a robust classifier, outperforming other techniques, while some algorithms perform poorly under specific conditions.
While anomaly detection is relatively well-studied, it remains a topic of ongoing interest and challenge, as our society becomes increasingly interconnected and digitalized. In this paper, we focus on existing anomaly detection approaches, by empirically studying the performance of 29 semi-supervised anomaly detection algorithms on 95 benchmark imbalanced databases from the KEEL repository. These include well-established and commonly used classifiers (e.g., One-Class Support Vector Machine (ocSVM) and Isolation Forest) and recent proposals (e.g., BRM and XGBOD). Findings from our in-depth empirical study show that BRM is a robust classifier, in terms of achieving better classification results than the other 28 state-of-the-art techniques on diverse anomaly detection problems. We also observe that OCKRA, Isolation Forest, and ocSVM achieve good overall AUC performance, but poor classification results on databases where the number of objects is equal to or greater than 1,460, all features are nominal, or the imbalance ratio is equal to or greater than 39.14. (c) 2021 Elsevier B.V. All rights reserved.
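
The abstract reports results in terms of AUC and imbalance ratio without defining either. The sketch below illustrates both quantities on toy data. It assumes the usual definitions: imbalance ratio as the majority-class count divided by the minority-class count, and AUC as the probability that a randomly chosen anomaly receives a higher score than a randomly chosen normal object. The function names, labels, and scores are hypothetical and are not taken from the paper.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Majority-class count divided by minority-class count (assumed definition)."""
    counts = Counter(labels)
    if len(counts) != 2:
        raise ValueError("expected exactly two classes")
    return max(counts.values()) / min(counts.values())


def auc(scores, labels, positive=1):
    """AUC as the fraction of (anomaly, normal) pairs in which the anomaly
    has the higher score, counting ties as one half."""
    pos = [s for s, y in zip(scores, labels) if y == positive]
    neg = [s for s, y in zip(scores, labels) if y != positive]
    if not pos or not neg:
        raise ValueError("need at least one object of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 marks an anomaly, 0 a normal object.
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.1, 0.2, 0.15, 0.3, 0.8, 0.25, 0.05, 0.4, 0.9, 0.7]

print(imbalance_ratio(labels))  # 8 normal objects / 2 anomalies
print(auc(scores, labels))      # 15 of 16 pairs are ranked correctly
```

Under these assumed definitions, a database with an imbalance ratio of 39.14 has roughly 39 normal objects for every anomaly.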
