Proceedings Paper

Incomplete Big Data Clustering Algorithm Using Feature Selection and Partial Distance

Publisher

IEEE
DOI: 10.1109/ICDH.2014.57

Keywords

big data; incomplete data clustering; feature subset selection; cluster analysis

Funding

  1. Liaoning Provincial Natural Science Foundation of China [201202032]
  2. Inner Mongolia University of Finance and Economics of China [KYZ1303]

Incomplete data clustering plays an important role in big data analysis and processing. Existing algorithms for clustering incomplete high-dimensional big data perform poorly in both efficiency and effectiveness. This paper proposes a clustering algorithm for incomplete high-dimensional big data based on feature selection and a partial distance strategy. First, a hierarchical clustering-based feature subset selection algorithm reduces the dimensionality of the data set. Next, a parallel k-means algorithm based on partial distance clusters the data subset selected in the first step. Experimental results show that the proposed algorithm achieves better clustering accuracy than existing algorithms and takes significantly less time to cluster high-dimensional big data.
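
The abstract does not give implementation details, so the following is only a minimal sketch of how k-means can be run with a partial distance, not the authors' algorithm. It assumes missing values are encoded as NaN and uses the common form of the partial distance strategy: squared differences are summed over the observed features only and rescaled by the ratio of total to observed features. The function names (`partial_distance`, `assign`, `update`, `kmeans_partial`) are hypothetical, the feature selection step is omitted, and the sketch is serial, whereas the paper describes a parallel variant.

```python
import math

def partial_distance(x, c):
    """Squared partial distance between a point x (NaN = missing) and a centroid c."""
    observed = [(xi, ci) for xi, ci in zip(x, c) if not math.isnan(xi)]
    if not observed:
        return math.inf  # no observed feature to compare on
    total = sum((xi - ci) ** 2 for xi, ci in observed)
    # Rescale so points with different numbers of observed features are comparable.
    return total * len(x) / len(observed)

def assign(points, centroids):
    """Assign every point to the centroid with the smallest partial distance."""
    return [
        min(range(len(centroids)), key=lambda k: partial_distance(p, centroids[k]))
        for p in points
    ]

def update(points, labels, centroids):
    """Recompute each centroid feature as the mean of the observed member values."""
    new_centroids = []
    for k, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == k]
        center = []
        for j, old_j in enumerate(old):
            vals = [p[j] for p in members if not math.isnan(p[j])]
            # Keep the previous value if no member has this feature observed.
            center.append(sum(vals) / len(vals) if vals else old_j)
        new_centroids.append(center)
    return new_centroids

def kmeans_partial(points, centroids, max_iter=100):
    """Alternate update and assignment steps until the labels stop changing."""
    labels = assign(points, centroids)
    for _ in range(max_iter):
        centroids = update(points, labels, centroids)
        new_labels = assign(points, centroids)
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids

nan = float("nan")
points = [[0.0, 0.0], [0.0, nan], [nan, 1.0], [10.0, 10.0], [nan, 9.0], [11.0, nan]]
labels, centroids = kmeans_partial(points, [[0.0, 0.0], [10.0, 10.0]])
print(labels)
print(centroids)
```

In this sketch the assignment step treats every point independently, which is the part of k-means that is usually distributed across workers; how the paper parallelizes it is not stated in the abstract.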