Article

A MapReduce-based artificial bee colony for large-scale data clustering

PATTERN RECOGNITION LETTERS (2017)

Journal

PATTERN RECOGNITION LETTERS

Volume 93, Pages 78-84

Publisher

ELSEVIER

DOI: 10.1016/j.patrec.2016.07.027

Keywords

Artificial Bee Colony (ABC); MapReduce; Data mining; Clustering; Distributed computing; Hadoop

Category

Computer Science, Artificial Intelligence

Funding

Faculty of Engineering at Sriracha, Kasetsart University Sriracha Campus [2559/1]

Abstract

Advances in technology have driven rapid growth in digital data, and sound data analysis is therefore necessary for making better decisions. Clustering is one of the most important techniques in data analysis, but clustering very large datasets remains a primary concern. Computational models are needed that can cluster huge volumes of data within a reasonable amount of time. MapReduce is a powerful programming model, with an associated implementation, for processing large datasets with a parallel, distributed algorithm on a computing cluster. In this paper, a MapReduce-based artificial bee colony, called MR-ABC, is proposed for data clustering. The ABC is implemented on the MapReduce model in the Hadoop framework and is used to optimize the assignment of large numbers of data instances to clusters, with the objective of minimizing the sum of the squared Euclidean distances between each data instance and the centroid of the cluster to which it belongs. The experimental results demonstrate that the proposed algorithm is well suited to massive amounts of data while maintaining the quality of the clustering results. (C) 2016 Elsevier B.V. All rights reserved.
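The objective named in the abstract can be illustrated with a minimal sketch. This is not the authors' MR-ABC code and does not use Hadoop: the function names (`map_partial_sse`, `reduce_sse`, `fitness`) are hypothetical, the data splits are toy values, and it assumes each instance is assigned to its nearest centroid. It only shows how the sum of squared Euclidean distances for one candidate solution (a set of centroids, as a bee might propose) could be computed per data split and then combined, in the spirit of a map step and a reduce step.

```python
from functools import reduce


def squared_distance(point, centroid):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((p - c) ** 2 for p, c in zip(point, centroid))


def map_partial_sse(chunk, centroids):
    """Map-like step (hypothetical): partial objective for one data split.

    Assumption: each instance belongs to its nearest centroid.
    """
    return sum(min(squared_distance(x, c) for c in centroids) for x in chunk)


def reduce_sse(partials):
    """Reduce-like step (hypothetical): add up the per-split partial sums."""
    return reduce(lambda a, b: a + b, partials, 0.0)


def fitness(splits, centroids):
    """Objective value of one candidate set of centroids; lower is better."""
    return reduce_sse(map_partial_sse(s, centroids) for s in splits)


# Toy data: two splits, as if stored on two different nodes.
splits = [
    [(0.0, 0.0), (0.0, 2.0)],
    [(10.0, 10.0), (10.0, 12.0)],
]
centroids = [(0.0, 1.0), (10.0, 11.0)]
print(fitness(splits, centroids))  # 4.0
```

In this toy case every instance lies at distance 1 from its nearest centroid, so the objective is 4.0. A candidate with centroids placed on the data points `(0.0, 0.0)` and `(10.0, 10.0)` scores 8.0 and would be the worse solution under this objective.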
