Article

Unsupervised Outlier Detection for Mixed-Valued Dataset Based on the Adaptive k-Nearest Neighbor Global Network

Journal

IEEE ACCESS
Volume 10, Issue -, Pages 32093-32103

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/ACCESS.2022.3161481

Keywords

Unsupervised outlier detection; k-nearest neighbor; mixed-valued dataset; network model; random walk process

Funding

  1. National Natural Science Foundation, China [51505480, 72001203]
  2. Graduate Research and Innovation Projects of Jiangsu Province, China [KYCX21_2477]


This study proposes an unsupervised outlier detection method for datasets with mixed-valued attributes, based on an adaptive k-NN global network. The method combines an adaptive search algorithm for k, the Heterogeneous Euclidean-Overlap Metric for distance measurement, and a random walk whose behavior is constrained by transition probabilities in order to detect outliers in the dataset.
Outlier detection aims to reveal data patterns that differ from the existing data. Owing to their robustness and interpretability, outlier detection methods for numerical datasets based on the k-Nearest Neighbor (k-NN) network have attracted much attention in recent years. However, the datasets produced in many practical contexts tend to contain both numerical and categorical attributes; that is, they are datasets with mixed-valued attributes (DMAs). The selection of k is a further issue for unlabeled datasets. Therefore, an unsupervised outlier detection method for DMAs based on an adaptive k-NN global network is proposed. First, an adaptive search algorithm is introduced that finds an appropriate value of k by considering the distribution characteristics of the dataset. Next, the distance between mixed-valued data objects is measured with the Heterogeneous Euclidean-Overlap Metric, and the k-NN of each data object is obtained. Then, an adaptive k-NN global network is constructed from the neighborhood relationships between data objects, and a customized random walk process is executed on it to detect outliers, using the transition probability to limit the behavior of the random walker. Finally, the effectiveness, accuracy, and applicability of the proposed method are demonstrated in a detailed experiment.
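
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The Heterogeneous Euclidean-Overlap Metric follows its standard definition (overlap distance for categorical attributes, range-normalized absolute difference for numerical ones). Everything else is an assumption made for illustration: `k` is fixed by hand rather than found by the paper's adaptive search, and the customized random walk is replaced by a generic damped (PageRank-style) walk on the directed k-NN graph. The function names `heom` and `knn_walk_scores` and the parameters `damping` and `iters` are hypothetical.

```python
import numpy as np

def heom(x, y, is_cat, ranges):
    """Heterogeneous Euclidean-Overlap Metric between two mixed-valued records."""
    total = 0.0
    for a, (xa, ya) in enumerate(zip(x, y)):
        if xa is None or ya is None:
            d = 1.0                        # missing value: maximal attribute distance
        elif is_cat[a]:
            d = 0.0 if xa == ya else 1.0   # overlap distance for categorical attributes
        else:
            # range-normalized difference for numerical attributes
            d = abs(xa - ya) / ranges[a] if ranges[a] > 0 else 0.0
        total += d * d
    return total ** 0.5

def knn_walk_scores(data, is_cat, k, damping=0.85, iters=100):
    """Visit probabilities of a damped random walk on the directed k-NN graph.

    Assumption: a generic PageRank-style walk stands in for the paper's
    customized random walk, and k is fixed instead of chosen adaptively.
    """
    n, m = len(data), len(is_cat)
    ranges = []
    for a in range(m):
        if is_cat[a]:
            ranges.append(0.0)             # unused for categorical attributes
        else:
            vals = [r[a] for r in data if r[a] is not None]
            ranges.append(max(vals) - min(vals))
    # pairwise HEOM distances; the diagonal is excluded from neighbor search
    D = np.array([[heom(data[i], data[j], is_cat, ranges) for j in range(n)]
                  for i in range(n)])
    np.fill_diagonal(D, np.inf)
    # row-stochastic transition matrix: each object steps to one of its k-NN
    P = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(D[i], kind="stable")[:k]
        P[i, nbrs] = 1.0 / k
    # power iteration with uniform restart
    pi = np.full(n, 1.0 / n)
    for _ in range(iters):
        pi = (1 - damping) / n + damping * pi @ P
    return pi

# Toy mixed-valued dataset: one numerical and one categorical attribute.
data = [(1.0, "a"), (1.1, "a"), (0.9, "a"), (1.2, "a"), (1.05, "a"), (9.0, "b")]
scores = knn_walk_scores(data, is_cat=[False, True], k=2)
outlier_index = int(np.argmin(scores))     # least-visited object
```

In this sketch, objects that few others count among their nearest neighbors are visited rarely by the walker, so a low score marks an outlier candidate. How the paper's transition probabilities actually restrict the walker is not specified in the abstract and is not reproduced here.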
