Article

A multiple clustering combination approach based on iterative voting process

Publisher

ELSEVIER
DOI: 10.1016/j.jksuci.2019.09.013

Keywords

Clustering ensemble; Combining multiple clustering; Cooperative clustering; Collaborative clustering


This paper introduces an Iterative Combining Clusterings Method (ICCM) that iteratively processes the dataset by extracting sub-clusters through a voting process, achieving higher effectiveness and robustness. Experimental results show significant improvements in clustering quality metrics and external validation metrics, confirming the usefulness of the proposed approach compared to other clustering ensemble methods.
This paper addresses the clustering ensemble problem, which aims to combine multiple clusterings into a solution that is likely to be better in terms of robustness, novelty and stability. The proposed Iterative Combining Clusterings Method (ICCM) processes the entire dataset iteratively, with each iteration following a two-step framework. In the first step, different clustering algorithms process the common dataset individually; in the second step, a set of sub-clusters is extracted through a voting process among the data objects. To overcome the ambiguity due to voting, only objects with a majority vote are assigned to their corresponding sub-clusters. The remaining objects are collected and re-clustered in subsequent iterations. At the end of the iterative process, a clustering algorithm groups the obtained sub-cluster centres and extracts the final clusters of the dataset. Two gene expression datasets and three real-life datasets were used to evaluate the proposed approach with external and internal criteria. The experimental results demonstrate the effectiveness and robustness of the proposed method, with improvements in the DB index of up to 16.89% on the iris dataset and up to 14.98% on the wine dataset. The external validity metrics confirm the usefulness of the proposed approach, which achieves the highest average NMI score, 81.05%, across the datasets compared with other clustering ensemble methods.

© 2019 The Authors. Production and hosting by Elsevier B.V. on behalf of King Saud University. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
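The voting step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say how labels from different clusterings are matched or how the majority is defined. The sketch assumes that each partition is aligned to the first one by maximum overlap and that an object is kept only when more than half of the aligned partitions agree. The names `align_labels`, `majority_vote` and `threshold` are hypothetical.

```python
from collections import Counter


def align_labels(reference, labels):
    """Relabel `labels` so each of its clusters takes the reference label
    it overlaps with most (an assumed alignment rule, not from the paper)."""
    overlap = {}
    for ref, lab in zip(reference, labels):
        overlap.setdefault(lab, Counter())[ref] += 1
    mapping = {lab: counts.most_common(1)[0][0] for lab, counts in overlap.items()}
    return [mapping[lab] for lab in labels]


def majority_vote(partitions, threshold=0.5):
    """Split objects into voted sub-cluster members and ambiguous leftovers.

    `partitions` is a list of label lists, one per base clustering.
    Returns (assigned, remaining): `assigned` maps object index to its voted
    label; `remaining` lists the indices left for the next iteration.
    """
    reference = partitions[0]
    aligned = [list(reference)] + [align_labels(reference, p) for p in partitions[1:]]
    assigned, remaining = {}, []
    for i in range(len(reference)):
        votes = Counter(p[i] for p in aligned)
        label, count = votes.most_common(1)[0]
        # Keep the object only if a strict majority of clusterings agree.
        if count / len(aligned) > threshold:
            assigned[i] = label
        else:
            remaining.append(i)
    return assigned, remaining


# Four hypothetical base clusterings of six objects (label names are arbitrary).
partitions = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1, 1],
    [5, 5, 7, 7, 7, 7],
]
assigned, remaining = majority_vote(partitions)
print(assigned)
print(remaining)
```

In this toy input, object 2 is split two votes against two, so it is left in `remaining` and would be re-clustered in the next iteration, while the other five objects form two sub-clusters.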


