Article

A MapReduce-based artificial bee colony for large-scale data clustering

Journal

PATTERN RECOGNITION LETTERS
Volume 93, Issue -, Pages 78-84

Publisher

ELSEVIER
DOI: 10.1016/j.patrec.2016.07.027

Keywords

Artificial Bee Colony (ABC); MapReduce; Data mining; Clustering; Distributed computing; Hadoop

Funding

  1. Faculty of Engineering at Sriracha, Kasetsart University Sriracha Campus [2559/1]

Technological progress has been a significant factor in the growth of digital data, so good data analysis is necessary for making better decisions. Clustering is one of the most important elements of data analysis, but clustering very large datasets remains a primary concern. Improved computational models that can cluster huge volumes of data within a reasonable amount of time are therefore required. MapReduce is a powerful programming model, with an associated implementation, for processing large datasets with a parallel, distributed algorithm on a computing cluster. In this paper, a MapReduce-based artificial bee colony called MR-ABC is proposed for data clustering. The ABC is implemented on the MapReduce model in the Hadoop framework and is used to optimize the assignment of large numbers of data instances to clusters, with the objective of minimizing the sum of squared Euclidean distances between each data instance and the centroid of the cluster to which it belongs. The experimental results demonstrate that the proposed algorithm is well suited to massive amounts of data, while the quality of the clustering results is maintained. © 2016 Elsevier B.V. All rights reserved.
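
The objective named in the abstract, the sum of squared Euclidean distances between each instance and its cluster centroid, can be illustrated with a small single-process sketch. This is not the authors' MR-ABC implementation and it does not use Hadoop. The function names (`map_point`, `reduce_by_cluster`, `sse`) and the sample data are assumptions made for illustration; they only mimic how a map step and a reduce step could split the work of scoring one candidate set of centroids.

```python
from collections import defaultdict


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def map_point(point, centroids):
    """Map step (hypothetical): emit (nearest centroid index, squared distance)."""
    distances = [squared_distance(point, c) for c in centroids]
    nearest = min(range(len(centroids)), key=distances.__getitem__)
    return nearest, distances[nearest]


def reduce_by_cluster(pairs):
    """Reduce step (hypothetical): sum the squared distances per cluster key."""
    totals = defaultdict(float)
    for cluster_id, dist in pairs:
        totals[cluster_id] += dist
    return dict(totals)


def sse(points, centroids):
    """Score one candidate solution: total sum of squared distances."""
    per_cluster = reduce_by_cluster(map_point(p, centroids) for p in points)
    return sum(per_cluster.values())


# Illustrative data only: two well-separated groups of 2-D points.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
centroids = [(0.0, 1.0), (10.0, 11.0)]
print(sse(points, centroids))
```

In an ABC-style search, each candidate solution would be one such set of centroids, and a lower `sse` value would indicate a better one. How MR-ABC distributes this evaluation across Hadoop jobs is described in the paper itself, not here.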
