4.6 Article

NaNOD: A natural neighbour-based outlier detection algorithm

Journal

NEURAL COMPUTING & APPLICATIONS
Volume 33, Issue 6, Pages 2107-2123

Publisher

SPRINGER LONDON LTD
DOI: 10.1007/s00521-020-05068-2

Keywords

Outlier detection; Natural neighbour; Kernel density estimation; Adaptive kernel width

Funding

  1. Department of Computer Science and Engineering, Indian Institute of Technology (Indian School of Mines), Dhanbad, India

Outlier detection is a crucial task in data mining applications. This study proposes a new unsupervised density-based outlier detection algorithm to address the weaknesses of existing algorithms. By combining adaptive parameter acquisition and weighted kernel density estimation with two types of nearest neighbours and a Gaussian kernel function, the proposed algorithm demonstrates improved outlier detection performance.
Outlier detection is an essential task in data mining, with applications including military surveillance, tax fraud detection, and telecommunications. In recent years, outlier detection has received significant attention compared with other knowledge discovery problems. This focus has produced several outlier detection algorithms, most of them based on either distance or density. However, each strategy has intrinsic weaknesses. Distance-based techniques suffer from the local density problem, while density-based methods are recognized as having an issue with low-density patterns. In addition, most existing outlier detection algorithms have a parameter selection problem, which leads to poor detection results. In this article, we present an unsupervised density-based outlier detection algorithm to deal with these shortcomings. The proposed algorithm uses the Natural Neighbour (NaN) concept to adaptively obtain a parameter called the Natural Value (NV), and a Weighted Kernel Density Estimation (WKDE) method to estimate the density at the location of an object. The algorithm also employs two different categories of nearest neighbours, k Nearest Neighbours (kNN) and Reverse Nearest Neighbours (RNN), which makes it flexible in modelling different data patterns. A Gaussian kernel function is adopted to achieve smoothness in the measure. Further, we use an adaptive kernel width to enhance the discrimination power between normal and outlier samples. Formal analysis and extensive experiments on both artificial and real datasets demonstrate that this technique can achieve better outlier detection performance.
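
The abstract names the ingredients of the method but gives no formulas, so the following is only a minimal sketch of how such a pipeline might be assembled, not the authors' implementation. It assumes that the Natural Value can be found by growing k until every point has a reverse neighbour or the number of points without one stops changing. It further assumes a uniformly weighted Gaussian kernel estimate over the union of kNN and RNN, a per-point kernel width equal to the mean kNN distance, and a score defined as the mean neighbour density divided by the point's own density. The function names `natural_neighbour_search` and `outlier_scores` are hypothetical, and the sketch assumes the data contain no duplicate points.

```python
import math


def natural_neighbour_search(points):
    """Grow k until every point is somebody's k-nearest neighbour, or the
    number of points with no reverse neighbour stops changing.
    Returns (k, order), where order[i] lists all other points nearest first."""
    n = len(points)
    order = [
        sorted((j for j in range(n) if j != i),
               key=lambda j: math.dist(points[i], points[j]))
        for i in range(n)
    ]
    reverse_count = [0] * n
    prev_orphans = -1
    for r in range(n - 1):
        # Round r: every point votes for its (r + 1)-th nearest neighbour.
        for i in range(n):
            reverse_count[order[i][r]] += 1
        orphans = reverse_count.count(0)
        if orphans == 0 or orphans == prev_orphans:
            return r + 1, order
        prev_orphans = orphans
    return n - 1, order


def outlier_scores(points):
    """Return one score per point; larger means more outlying (assumed scoring)."""
    n, dim = len(points), len(points[0])
    k, order = natural_neighbour_search(points)

    # Neighbourhood of i: its k nearest neighbours plus its reverse neighbours.
    neigh = [set(order[i][:k]) for i in range(n)]
    for j in range(n):
        for i in order[j][:k]:
            neigh[i].add(j)

    density = []
    for i in range(n):
        # Assumed adaptive width: mean distance from i to its k nearest neighbours.
        h = sum(math.dist(points[i], points[j]) for j in order[i][:k]) / k
        norm = (math.sqrt(2 * math.pi) * h) ** dim
        kernel_sum = sum(
            math.exp(-math.dist(points[i], points[j]) ** 2 / (2 * h * h))
            for j in neigh[i]
        )
        density.append(kernel_sum / (len(neigh[i]) * norm))

    # Assumed score: mean density of the neighbours relative to the point's own.
    return [
        sum(density[j] for j in neigh[i]) / (len(neigh[i]) * density[i])
        for i in range(n)
    ]


if __name__ == "__main__":
    # Toy data: a small regular grid plus one far-away point.
    data = [(float(x), float(y)) for x in range(5) for y in range(5)]
    data.append((1000.0, 1000.0))
    scores = outlier_scores(data)
    print(max(range(len(data)), key=lambda i: scores[i]))
```

In this sketch no value of k is supplied by the caller, which mirrors the adaptive parameter acquisition the abstract describes. The weighting scheme, width rule, and scoring formula used in the paper may differ from the assumptions made here.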
