Article

An entropy-based density peaks clustering algorithm for mixed type data employing fuzzy neighborhood

KNOWLEDGE-BASED SYSTEMS (2017)

Journal

KNOWLEDGE-BASED SYSTEMS

Volume 133, Pages 294-313

Publisher

ELSEVIER

DOI: 10.1016/j.knosys.2017.07.027

Keywords

Entropy; Density peaks clustering; Mixed type data; Fuzzy neighborhood

Category

Computer Science, Artificial Intelligence

Funding

National Natural Science Foundation of China [61672522, 61379101]
China Postdoctoral Science Foundation [2016M601910]
Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD)
Jiangsu Collaborative Innovation Center on Atmospheric Environment and Equipment Technology (CICAEET)

Abstract

Most clustering algorithms assume that data contain only numerical values. In practice, however, data sets containing both numerical and categorical attributes are ubiquitous in real-world tasks, and grouping such data effectively is an important yet challenging problem. Most current algorithms are sensitive to initialization and are generally unsuitable for data with non-spherical distributions. To address this, we propose an entropy-based density peaks clustering algorithm for mixed type data employing fuzzy neighborhood (DP-MD-FN). First, we propose a new similarity measure that applies a uniform criterion to both categorical and numerical attributes, which avoids feature transformation and parameter adjustment between categorical and numerical values. We integrate this entropy-based strategy with the density peaks clustering method. In addition, to improve the robustness of the original algorithm, we use a fuzzy neighborhood relation to redefine the local density. Furthermore, to select the cluster centers automatically, we develop a simple determination strategy based on the gamma-graph. The method can handle three types of data: numerical, categorical, and mixed type. We compare the performance of our algorithm with that of traditional clustering algorithms such as K-Modes, K-Prototypes, KL-FCM-GM, EKP and OCIL. Experiments on different benchmark data sets demonstrate the effectiveness and robustness of the proposed algorithm. © 2017 Elsevier B.V. All rights reserved.
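
The abstract names the building blocks (a similarity measure for mixed attributes, a local density, and a gamma-graph for picking centers) but does not give their formulas. The sketch below therefore illustrates only the generic density peaks idea on mixed data and is not the DP-MD-FN method itself. It makes several assumptions that are not from the paper: a Gower-style distance stands in for the entropy-based similarity, a Gaussian kernel stands in for the fuzzy-neighborhood density, gamma is taken as the product of density and separation, and the number of centers is passed in by hand rather than determined automatically. The names `mixed_distance`, `density_peaks`, `d_c` and `n_clusters` are hypothetical.

```python
import math


def mixed_distance(a, b, numeric_idx, ranges):
    """Gower-style distance (an assumption, NOT the paper's entropy-based measure).

    Numeric attributes contribute a range-normalized absolute difference;
    categorical attributes contribute 0 on a match and 1 on a mismatch.
    """
    total = 0.0
    for j, (x, y) in enumerate(zip(a, b)):
        if j in numeric_idx:
            total += abs(x - y) / ranges[j] if ranges[j] else 0.0
        else:
            total += 0.0 if x == y else 1.0
    return total / len(a)


def density_peaks(points, numeric_idx, d_c, n_clusters):
    """Generic density peaks clustering; returns (labels, centers, gamma)."""
    n = len(points)
    ranges = {j: max(p[j] for p in points) - min(p[j] for p in points)
              for j in numeric_idx}
    dist = [[mixed_distance(points[i], points[k], numeric_idx, ranges)
             for k in range(n)] for i in range(n)]

    # Local density: Gaussian kernel with bandwidth d_c (an assumption that
    # stands in for the fuzzy-neighborhood density mentioned in the abstract).
    rho = [sum(math.exp(-(dist[i][k] / d_c) ** 2) for k in range(n) if k != i)
           for i in range(n)]

    # Visit points from densest to sparsest.
    order = sorted(range(n), key=lambda i: -rho[i])
    rank = {i: pos for pos, i in enumerate(order)}

    # delta: distance to the nearest point that comes earlier in the density order.
    delta = [0.0] * n
    parent = [-1] * n
    for pos, i in enumerate(order):
        if pos == 0:
            delta[i] = max(dist[i])  # densest point: use its largest distance
        else:
            k = min(order[:pos], key=lambda h: dist[i][h])
            delta[i] = dist[i][k]
            parent[i] = k

    # gamma combines density and separation; the largest values mark centers.
    gamma = [rho[i] * delta[i] for i in range(n)]
    centers = sorted(range(n), key=lambda i: (-gamma[i], rank[i]))[:n_clusters]

    labels = [-1] * n
    for c, i in enumerate(centers):
        labels[i] = c
    # Each remaining point inherits the label of its denser nearest neighbor,
    # which has already been labeled because we walk in density order.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centers, gamma
```

A call such as `density_peaks(rows, numeric_idx={0}, d_c=0.2, n_clusters=2)` takes a list of tuples whose attribute 0 is numeric and whose other attributes are categorical. Sorting the returned `gamma` values in decreasing order and looking for a sharp drop is one simple way to inspect how many centers the data suggest.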
