Article

Outlier Detection in High Dimensional Data

JOURNAL OF INFORMATION & KNOWLEDGE MANAGEMENT (2020)

Journal

JOURNAL OF INFORMATION & KNOWLEDGE MANAGEMENT

Volume 19, Issue 1, Pages -

Publisher

WORLD SCIENTIFIC PUBL CO PTE LTD

DOI: 10.1142/S0219649220400134

Keywords

Outlier detection; high dimensional data; PCA; KDE

Category

Information Science & Library Science

Abstract

High-dimensional data poses unique challenges for outlier detection. Most existing algorithms fail to properly address the issues that stem from a large number of features. In particular, outlier detection algorithms perform poorly on small datasets with a large number of features. In this paper, we propose a novel outlier detection algorithm based on principal component analysis (PCA) and kernel density estimation (KDE). The proposed method addresses the challenges of high-dimensional data by projecting the original data onto a lower-dimensional space and using the innate structure of the data to calculate an anomaly score for each data point. Numerical experiments on synthetic and real-life data show that our method performs well on high-dimensional data. In particular, the proposed method outperforms the benchmark methods as measured by F1-score. Our method also produces better-than-average execution times compared with the benchmark methods.
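
The abstract does not give the details of the algorithm, so the following is only a generic sketch of the idea it describes: project the data onto a few principal components, then score each point by how sparse its neighborhood is under a kernel density estimate. It is not the authors' implementation. The function names, the choice of two components, the leave-one-out Gaussian kernel, and the Scott-style bandwidth are all assumptions made for illustration.

```python
import numpy as np


def pca_project(X, n_components):
    """Project the rows of X onto the top principal components (via SVD)."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T


def kde_log_density(Z, bandwidth):
    """Leave-one-out Gaussian KDE log-density at each row of Z."""
    n, d = Z.shape
    sq_dists = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    log_kernel = -0.5 * sq_dists / bandwidth**2
    # Exclude each point from its own density estimate.
    np.fill_diagonal(log_kernel, -np.inf)
    # Log-sum-exp over the remaining points, for numerical stability.
    row_max = log_kernel.max(axis=1, keepdims=True)
    log_sum = row_max[:, 0] + np.log(np.exp(log_kernel - row_max).sum(axis=1))
    log_norm = np.log(n - 1) + 0.5 * d * np.log(2 * np.pi * bandwidth**2)
    return log_sum - log_norm


def outlier_scores(X, n_components=2, bandwidth=None):
    """Anomaly score per row: negative log-density in the PCA subspace."""
    Z = pca_project(X, n_components)
    if bandwidth is None:
        # Assumption: a single Scott-style bandwidth from the average
        # per-component standard deviation.
        n, d = Z.shape
        bandwidth = Z.std(axis=0).mean() * n ** (-1.0 / (d + 4))
    return -kde_log_density(Z, bandwidth)


# Synthetic example: 60 samples, 200 features, first 3 rows shifted.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 200))
X[:3] += 4.0
scores = outlier_scores(X)
print("indices of the three highest scores:", np.argsort(scores)[-3:])
```

Higher scores mean lower estimated density in the reduced space. In practice the number of components and the bandwidth would need tuning, and the paper may choose them quite differently.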
