4.5 Article

A hybrid clustering algorithm based on missing attribute interval estimation for incomplete data

Journal

PATTERN ANALYSIS AND APPLICATIONS
Volume 18, Issue 2, Pages 377-384

Publisher

SPRINGER
DOI: 10.1007/s10044-014-0376-8

Keywords

Incomplete data set; Intervals reconstruction; Particle swarm; Fuzzy c-means; Clustering

Funding

  1. National Natural Science Foundation of China [61174115, 51104044]

Partially missing data sets are a prevalent problem in cluster analysis. We propose a hybrid algorithm that combines fuzzy clustering with particle swarm optimization (PSO) for clustering incomplete data, in which missing attributes are represented as intervals. We further develop a neighbor interval reconstruction (NIR) method that builds on pre-classification results and estimates the interval of each missing attribute with the nearest-neighbor rule. This prevents interval endpoints from being determined by information from different classes, which improves the accuracy of the missing-attribute intervals and makes the imputation more robust. The hybrid PSO and fuzzy c-means algorithm is then used to cluster the interval-valued data set; the global optimization ability of PSO can improve clustering accuracy compared with gradient-based optimization methods. Experimental results on several UCI data sets show the superiority of the proposed NIR hybrid algorithm.
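
To make the idea of representing a missing attribute as a nearest-neighbor interval concrete, here is a minimal Python sketch. It is not the paper's NIR procedure: the abstract does not give its details, and the pre-classification step is omitted here. The sketch assumes that missing values are stored as `None`, that neighbors are ranked by a partial Euclidean distance over the attributes both samples share, and that the interval is the minimum and maximum of the attribute over the `k` nearest neighbors. The function names and the parameter `k` are hypothetical.

```python
import math


def partial_distance(a, b):
    """Euclidean distance over attributes observed in both samples,
    rescaled to the full dimension (assumed distance, not from the paper)."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf  # no common attribute: treat as infinitely far
    squared = sum((x - y) ** 2 for x, y in shared)
    return math.sqrt(squared * len(a) / len(shared))


def nearest_neighbor_intervals(data, k=2):
    """Turn every attribute into an interval (low, high).

    Observed values become degenerate intervals (v, v). A missing value
    becomes (min, max) of that attribute over the k nearest samples in
    which the attribute is observed, or None if no such sample exists.
    """
    result = []
    for i, row in enumerate(data):
        intervals = []
        for j, value in enumerate(row):
            if value is not None:
                intervals.append((value, value))
                continue
            # Rank the other samples that have attribute j observed.
            candidates = sorted(
                (partial_distance(row, other), other[j])
                for m, other in enumerate(data)
                if m != i and other[j] is not None
            )
            neighbors = [v for d, v in candidates[:k] if d < math.inf]
            if neighbors:
                intervals.append((min(neighbors), max(neighbors)))
            else:
                intervals.append(None)
        result.append(intervals)
    return result


# Hypothetical toy data: the second sample is missing its second attribute.
data = [[1.0, 2.0], [1.1, None], [0.9, 1.8], [5.0, 9.0]]
print(nearest_neighbor_intervals(data, k=2)[1])  # [(1.1, 1.1), (1.8, 2.0)]
```

In this toy example the distant sample `[5.0, 9.0]` does not influence the interval, which illustrates why restricting the endpoints to close neighbors can keep information from other classes out of the estimate. An interval-valued clustering step, such as the PSO and fuzzy c-means hybrid described above, would then operate on these intervals.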
