A weighting k-modes algorithm for subspace clustering of categorical data

Journal

NEUROCOMPUTING
Volume 108, Pages 23-30

Publisher

ELSEVIER
DOI: 10.1016/j.neucom.2012.11.009

Keywords

Subspace clustering; Weight; k-Modes algorithm; Categorical data

Funding

  1. National Natural Science Foundation of China [71031006, 70971080, 60970014]
  2. Special Prophase Project on National Key Basic Research and Development Program of China (973) [2011CB311805]
  3. Natural Science Foundation of Shanxi [2010021016-2, 2010011021-1]
  4. China Postdoctoral Science Foundation [2012M510046]

Abstract

Traditional clustering algorithms treat all dimensions of an input data set equally. In high-dimensional data, however, data points are commonly clustered in subspaces, which means that classes of objects are categorized in subspaces rather than in the entire space. Subspace clustering extends traditional clustering by seeking clusters in different subspaces of a data set. This paper presents a weighting k-modes algorithm for subspace clustering of categorical data and analyzes its time complexity. The proposed algorithm adds a step to the k-modes clustering process that automatically computes the weight of every dimension in each cluster using complement entropy. The resulting attribute weights can be used to identify the subsets of important dimensions that categorize different clusters. The effectiveness of the proposed algorithm is demonstrated on real and synthetic data sets. (C) 2012 Elsevier B.V. All rights reserved.
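
The idea described in the abstract, a k-modes loop with an extra per-cluster weighting step, can be sketched roughly as follows. This is a minimal illustration, not the authors' algorithm: the abstract does not give the weight formula, so the sketch assumes that complement entropy of an attribute is the sum over its values of p(1 - p), and it assumes a hypothetical weighting that normalizes exp(-entropy) across attributes so that low-entropy (more consistent) attributes receive larger weights. All function names are invented for the example.

```python
from collections import Counter
import math


def complement_entropy(values):
    """Complement entropy of one categorical attribute: sum of p * (1 - p)."""
    n = len(values)
    if n == 0:
        return 0.0
    return sum((c / n) * (1 - c / n) for c in Counter(values).values())


def attribute_weights(members, n_attrs):
    """ASSUMPTION: weight = normalized exp(-complement entropy) per attribute.

    This is an illustrative choice, not the formula from the paper.
    """
    scores = [
        math.exp(-complement_entropy([row[j] for row in members]))
        for j in range(n_attrs)
    ]
    total = sum(scores)
    return [s / total for s in scores]


def cluster_mode(members, n_attrs):
    """Most frequent value of each attribute among the cluster members."""
    return [
        Counter(row[j] for row in members).most_common(1)[0][0]
        for j in range(n_attrs)
    ]


def weighted_mismatch(row, mode, weights):
    """Simple matching dissimilarity, with each mismatch scaled by its weight."""
    return sum(w for x, m, w in zip(row, mode, weights) if x != m)


def weighted_k_modes(data, init_modes, max_iter=20):
    """k-modes with an added step that recomputes per-cluster attribute weights."""
    n_attrs = len(data[0])
    k = len(init_modes)
    modes = [list(m) for m in init_modes]
    # Start from uniform weights in every cluster.
    weights = [[1.0 / n_attrs] * n_attrs for _ in range(k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest mode under the cluster's own weights.
        new_labels = [
            min(range(k), key=lambda c: weighted_mismatch(row, modes[c], weights[c]))
            for row in data
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        # Update step: recompute modes, then the additional weighting step.
        for c in range(k):
            members = [row for row, lab in zip(data, labels) if lab == c]
            if members:  # keep the old mode and weights for an empty cluster
                modes[c] = cluster_mode(members, n_attrs)
                weights[c] = attribute_weights(members, n_attrs)
    return labels, modes, weights
```

Calling `weighted_k_modes(data, init_modes)` on a list of equal-length tuples of category labels returns the cluster labels, the cluster modes, and one weight vector per cluster. Attributes with the largest weights in a cluster are the ones this sketch would treat as that cluster's subspace.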
