Proceedings Paper

Incomplete Big Data Clustering Algorithm Using Feature Selection and Partial Distance

2014 5TH INTERNATIONAL CONFERENCE ON DIGITAL HOME (ICDH) (2014)

Journal

2014 5TH INTERNATIONAL CONFERENCE ON DIGITAL HOME (ICDH)

Volume -, Issue -, Pages 263-266

Publisher

IEEE

DOI: 10.1109/ICDH.2014.57

Keywords

big data; incomplete data clustering; feature subset selection; cluster analysis

Categories

Computer Science, Interdisciplinary Applications

Funding

Liaoning Provincial Natural Science Foundation of China [201202032]
Inner Mongolia University of Finance and Economics of China [KYZ1303]

Abstract

Incomplete data clustering plays an important role in big data analysis and processing. Existing algorithms for clustering incomplete high-dimensional big data perform poorly in both efficiency and effectiveness. This paper proposes a clustering algorithm for incomplete high-dimensional big data based on feature selection and a partial distance strategy. First, a hierarchical clustering-based feature subset selection algorithm is designed to reduce the dimensionality of the data set. Next, a parallel k-means algorithm based on partial distance is derived to cluster the data subset selected in the first step. Experimental results demonstrate that the proposed algorithm achieves better clustering accuracy than existing algorithms and takes significantly less time to cluster high-dimensional big data.
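
To make the partial distance idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes the commonly used partial distance definition: the squared distance is summed over the observed features only and rescaled by the ratio of total to observed features. It also assumes missing values are encoded as NaN. The function names `partial_distance_sq` and `kmeans_partial` are hypothetical, the loop is serial rather than parallel, and the feature selection step is not shown.

```python
import numpy as np

def partial_distance_sq(X, centers):
    """Squared partial distance from each row of X (NaN = missing) to each center."""
    X = np.asarray(X, dtype=float)
    centers = np.asarray(centers, dtype=float)
    observed = ~np.isnan(X)                        # (n, d) mask of observed entries
    X_filled = np.where(observed, X, 0.0)          # placeholder zeros, masked out below
    diff = (X_filled[:, None, :] - centers[None, :, :]) * observed[:, None, :]
    n_obs = observed.sum(axis=1)                   # observed features per row
    scale = X.shape[1] / np.maximum(n_obs, 1)      # rescale to the full dimension
    return scale[:, None] * (diff ** 2).sum(axis=2)

def kmeans_partial(X, init_centers, n_iter=50):
    """Serial k-means on incomplete data; the caller supplies the initial centers."""
    X = np.asarray(X, dtype=float)
    centers = np.array(init_centers, dtype=float)
    observed = ~np.isnan(X)
    X_filled = np.where(observed, X, 0.0)
    for _ in range(n_iter):
        # Assignment step: nearest center under the partial distance.
        labels = partial_distance_sq(X, centers).argmin(axis=1)
        # Update step: per-feature mean over members that observe that feature.
        new_centers = centers.copy()
        for j in range(len(centers)):
            members = labels == j
            counts = observed[members].sum(axis=0)
            sums = X_filled[members].sum(axis=0)
            seen = counts > 0                      # keep old value if nobody observed it
            new_centers[j, seen] = sums[seen] / counts[seen]
        converged = np.allclose(new_centers, centers)
        centers = new_centers
        if converged:
            break
    return labels, centers

# Illustrative data (made up for this sketch): two groups, one missing value per row.
X_demo = np.array([
    [0.0, 0.0, np.nan], [0.1, np.nan, 0.0], [np.nan, 0.2, 0.1],
    [10.0, 10.0, np.nan], [10.1, np.nan, 10.0], [np.nan, 9.9, 10.2],
])
labels_demo, centers_demo = kmeans_partial(X_demo, [[0.0, 0.0, 0.0], [10.0, 10.0, 10.0]])
```

Passing the initial centers explicitly keeps the sketch deterministic; with random initialization on heavily incomplete data, k-means can settle into a poor local minimum, which is one reason initialization matters in practice.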
