Article

A feature group weighting method for subspace clustering of high-dimensional data

PATTERN RECOGNITION (2012)

Journal

PATTERN RECOGNITION

Volume 45, Issue 1, Pages 434-446

Publisher

ELSEVIER SCI LTD

DOI: 10.1016/j.patcog.2011.06.004

Keywords

Data mining; Subspace clustering; k-Means; Feature weighting; High-dimensional data analysis

Categories

Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic

Funding

NSFC [61073195]
Shenzhen New Industry Development Fund [CX8201005250024A, CXB201005250021A]

Abstract

This paper proposes a new method to weight subspaces in feature groups and individual features for clustering high-dimensional data. In this method, the features of high-dimensional data are divided into feature groups based on their natural characteristics. Two types of weights are introduced into the clustering process to simultaneously identify the importance of feature groups and of individual features in each cluster. A new optimization model defines the clustering process, and a new clustering algorithm, FG-k-means, is proposed to solve it. The new algorithm extends k-means with two additional steps that automatically calculate the two types of subspace weights. A new data generation method is presented to generate high-dimensional data with clusters in subspaces of both feature groups and individual features. Experimental results on synthetic and real-life data show that the FG-k-means algorithm significantly outperformed four k-means type algorithms (k-means, W-k-means, LAC, and EWKM) in almost all experiments. The new algorithm is robust to noise and missing values, which commonly exist in high-dimensional data. © 2011 Elsevier Ltd. All rights reserved.
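The abstract describes the algorithm only in outline: k-means plus two extra steps that compute per-cluster feature-group weights and per-cluster feature weights. The sketch below illustrates that loop structure in Python with NumPy. It is not the paper's implementation. This record gives no update formulas, so the sketch assumes entropy-regularized (softmax-style) weight updates of the kind used in EWKM-type methods. The function names and the parameters `lam` and `eta` (controlling how sharply group and feature weights concentrate) are hypothetical.

```python
import numpy as np


def softmax_neg(cost, temperature):
    """Row-wise softmax of -cost / temperature, shifted for numerical stability."""
    scaled = -cost / temperature
    scaled = scaled - scaled.max(axis=1, keepdims=True)
    e = np.exp(scaled)
    return e / e.sum(axis=1, keepdims=True)


def fg_kmeans_sketch(X, groups, k, lam=1.0, eta=1.0, n_iter=20, seed=0):
    """Illustrative k-means loop with group and feature weights.

    X      : (n, m) data matrix
    groups : length-m integer array, group id (0..T-1) of each feature
    lam, eta : assumed regularization strengths (hypothetical names)
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    n, m = X.shape
    T = int(groups.max()) + 1

    centers = X[rng.choice(n, size=k, replace=False)].copy()
    W = np.full((k, T), 1.0 / T)  # group weights, one row per cluster
    V = np.full((k, m), 1.0 / m)  # feature weights, one row per cluster
    labels = np.zeros(n, dtype=int)

    for _ in range(n_iter):
        # Step 1 (k-means): assign each object to the nearest center under
        # the distance weighted by group weight times feature weight.
        sq = (X[:, None, :] - centers[None, :, :]) ** 2  # (n, k, m)
        coef = W[:, groups] * V                          # (k, m)
        labels = (sq * coef[None, :, :]).sum(axis=2).argmin(axis=1)

        # Step 2 (k-means): recompute centers; empty clusters keep their center.
        for l in range(k):
            members = X[labels == l]
            if len(members):
                centers[l] = members.mean(axis=0)

        # Per-cluster, per-feature dispersion around the new centers.
        disp = np.zeros((k, m))
        np.add.at(disp, labels, (X - centers[labels]) ** 2)

        # Step 3 (added): group weights; groups with low dispersion gain weight.
        group_cost = np.zeros((k, T))
        for t in range(T):
            group_cost[:, t] = (V * disp)[:, groups == t].sum(axis=1)
        W = softmax_neg(group_cost, lam)

        # Step 4 (added): feature weights; features with low dispersion gain weight.
        V = softmax_neg(W[:, groups] * disp, eta)

    return labels, centers, W, V
```

Under these assumptions, each row of `W` and `V` sums to 1, so the weights can be read as the relative importance of each feature group and each feature within a cluster. Larger `lam` or `eta` values spread the weights more evenly, while smaller values concentrate them on the lowest-dispersion groups or features.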