Article

Benchmarking the Clustering Performances of Evolutionary Algorithms: A Case Study on Varying Data Size

Journal

IRBM
Volume 41, Issue 5, Pages 267-275

Publisher

ELSEVIER SCIENCE INC
DOI: 10.1016/j.irbm.2020.06.002

Keywords

Clustering; Optimization; Evolutionary algorithms; PSO; GWO; BBO; K-means

Background and objective: Clustering has been a widely used method for data analysis for years, and many clustering algorithms exist. Today it is used in many prediction, collaborative filtering and automatic segmentation systems across different domains. To be broadly adopted in practice, new clustering algorithms need to offer both better performance and greater robustness than those currently in use. In recent years, evolutionary algorithms have been used in many domains because they are robust and easy to implement, and many clustering problems can be solved with such algorithms once the problem is modeled as an optimization problem. In this paper, we present an optimization approach to clustering using four well-known evolutionary algorithms: Biogeography-Based Optimization (BBO), Grey Wolf Optimization (GWO), Genetic Algorithm (GA) and Particle Swarm Optimization (PSO). Method: The objective function is specified to minimize the total distance from the cluster centers to the data points, with Euclidean distance as the distance measure. We applied this objective function to the four algorithms both to find the most efficient clustering algorithm and to compare the algorithms' clustering performance across different data sizes. To benchmark clustering performance, the experiments used a number of datasets of different sizes: small-scale, medium-scale and big data. Performance was compared against K-means, which has been widely used in the literature for years. The Rand Index, Adjusted Rand Index, Mirkin's Index and Hubert's Index were used as evaluation criteria. Result: Across the datasets of varying size and under the specified performance criteria, the GA and GWO algorithms showed better clustering performance than the others.
Conclusions: The results show that although the algorithms achieved satisfactory clustering results on small- and medium-scale datasets, their clustering performance on big data needs to be improved. (C) 2020 AGBM. Published by Elsevier Masson SAS. All rights reserved.
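
As an illustration of the method described in the abstract, the following is a minimal Python sketch of a distance-based clustering objective and of the Rand Index. It is not the authors' implementation. The function names (`total_distance`, `assign_clusters`, `rand_index`) are hypothetical, and the sketch assumes that each data point contributes its Euclidean distance to the nearest cluster center, a detail the abstract does not state.

```python
import numpy as np


def _pairwise_distances(centers, data):
    """Euclidean distances between every point and every center.

    Returns an array of shape (n_points, n_clusters).
    """
    data = np.asarray(data, dtype=float)
    centers = np.asarray(centers, dtype=float)
    return np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)


def total_distance(centers, data):
    """Objective to minimize: sum over points of the distance to the
    nearest center (assumption: nearest-center assignment)."""
    return float(_pairwise_distances(centers, data).min(axis=1).sum())


def assign_clusters(centers, data):
    """Label each point with the index of its nearest center."""
    return _pairwise_distances(centers, data).argmin(axis=1)


def rand_index(labels_true, labels_pred):
    """Rand Index: fraction of point pairs on which the two labelings
    agree (both place the pair together, or both place it apart)."""
    a = np.asarray(labels_true)
    b = np.asarray(labels_pred)
    same_a = a[:, None] == a[None, :]
    same_b = b[:, None] == b[None, :]
    upper = np.triu_indices(len(a), k=1)  # each unordered pair once
    return float(np.mean(same_a[upper] == same_b[upper]))


if __name__ == "__main__":
    # Toy data, made up for illustration only.
    data = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
    centers = np.array([[0.0, 0.5], [10.0, 10.5]])
    print(total_distance(centers, data))
    print(assign_clusters(centers, data))
```

An evolutionary optimizer such as PSO or GA would typically treat the flattened array of center coordinates as one candidate solution and call `total_distance` as its fitness function; the resulting labels can then be scored against ground-truth labels with `rand_index`.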
