4.7 Article

Variable Weighting in Fuzzy k-Means Clustering to Determine the Number of Clusters

Journal

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
Volume 32, Issue 9, Pages 1838-1853

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TKDE.2019.2911582

Keywords

Fuzzy k-means; clustering; number of clusters; data mining; variable weighting

Funding

  1. National Key R&D Plan Key Special Plan on Public Security Risk Mitigation/Response Technologies and Equipment [2017YFC0804003]
  2. Guangdong Education Bureau Fund [2017KTSCX166]
  3. Science and Technology Innovation Committee Foundation of Shenzhen [JCYJ20170817112037041, ZDSYS201703031748284002E]

Abstract

One of the most significant problems in cluster analysis is determining the number of clusters in unlabeled data, which most clustering algorithms require as input. Several methods have been developed to address this problem. However, little attention has been paid to algorithms that are insensitive to the initialization of cluster centers and that use variable weights to recover the number of clusters. To fill this gap, we extend the standard fuzzy k-means clustering algorithm. The extended algorithm automatically determines the number of clusters by iteratively calculating the weights of all variables and the membership value of each object in all clusters. Two new steps are added to the fuzzy k-means clustering process. The first introduces a penalty term that makes the clustering process insensitive to the initial cluster centers. The second uses a formula to iteratively update the variable weights in each cluster based on the current partition of the data. Experimental results on real-world and synthetic datasets show that the proposed algorithm effectively determines the correct number of clusters even when initialized with different numbers of cluster centroids. We also tested the proposed algorithm on gene data to determine a subset of important genes.
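For orientation, the baseline that the abstract says is being extended can be sketched as follows. This is a minimal pure-Python sketch of standard fuzzy k-means only, using the textbook membership and center updates. It does not reproduce the paper's penalty term or its variable-weight formula, since the abstract does not state them. The function name `fuzzy_kmeans`, its parameters, and the fuzzifier default `m=2.0` are assumptions made for illustration.

```python
def fuzzy_kmeans(points, init_centers, m=2.0, n_iter=100, tol=1e-6):
    """Standard fuzzy k-means on a list of equal-length numeric tuples.

    Illustrative baseline only: no penalty term and no variable weights.
    Returns (centers, memberships); memberships[i][j] is the degree to
    which point i belongs to cluster j, and each row sums to 1.
    """
    dim = len(points[0])
    k = len(init_centers)
    centers = [list(c) for c in init_centers]
    memberships = []
    for _ in range(n_iter):
        # Membership update: u_ij is proportional to d_ij^(-2/(m-1)),
        # where d_ij is the Euclidean distance from point i to center j.
        memberships = []
        for p in points:
            d2 = [sum((p[v] - c[v]) ** 2 for v in range(dim)) for c in centers]
            if min(d2) == 0.0:
                # The point sits exactly on a center: assign it there crisply.
                row = [1.0 if d == 0.0 else 0.0 for d in d2]
            else:
                row = [d ** (-1.0 / (m - 1.0)) for d in d2]
            total = sum(row)
            memberships.append([r / total for r in row])
        # Center update: mean of all points weighted by u_ij^m.
        new_centers = []
        for j in range(k):
            w = [row[j] ** m for row in memberships]
            total = sum(w)
            new_centers.append(
                [sum(wi * p[v] for wi, p in zip(w, points)) / total
                 for v in range(dim)]
            )
        # Stop once no center coordinate moves by more than tol.
        shift = max(
            abs(a - b)
            for old, new in zip(centers, new_centers)
            for a, b in zip(old, new)
        )
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships


points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
centers, memberships = fuzzy_kmeans(points, [(0.0, 0.0), (5.0, 5.0)])
```

Note that this baseline takes both the number of clusters and the initial centers from the caller (via `init_centers`). Those are the two inputs the abstract says the proposed extension is designed not to depend on.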
