☆ 4.5 Article

MapReduce-based fuzzy c-means clustering algorithm: implementation and scalability

INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS (2015)

Journal

INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS

Volume 6, Issue 6, Pages 923-934

Publisher

SPRINGER HEIDELBERG

DOI: 10.1007/s13042-015-0367-0

Keywords

MapReduce; Hadoop; Scalability

Category

Computer Science, Artificial Intelligence

Abstract

The management and analysis of big data has been identified as one of the most important emerging needs in recent years, owing to the sheer volume and increasing complexity of the data being created or collected. Current clustering algorithms cannot handle big data, and therefore scalable solutions are necessary. Since fuzzy clustering algorithms have been shown to outperform hard clustering approaches in terms of accuracy, this paper investigates the parallelization and scalability of a common and effective fuzzy clustering algorithm, the fuzzy c-means (FCM) algorithm. The algorithm is parallelized using the MapReduce paradigm, and the paper outlines how the Map and Reduce primitives are implemented. A validity analysis is conducted to show that the implementation works correctly, achieving purity results competitive with state-of-the-art clustering algorithms. Furthermore, a scalability analysis is conducted to demonstrate the performance of the parallel FCM implementation with an increasing number of computing nodes.
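
The abstract does not spell out the Map and Reduce primitives, so the following is only a minimal sketch of how one FCM iteration could be split into a map step and a reduce step. It assumes the textbook FCM update rules with fuzzifier m = 2 and simulates the shuffle phase in memory instead of using Hadoop. The function names (`memberships`, `map_point`, `reduce_cluster`, `fcm_iteration`) and the sample data are hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict


def memberships(point, centers, m=2.0):
    """Textbook FCM membership of one point in every cluster (assumed rule)."""
    dists = [math.dist(point, c) for c in centers]
    # A point that coincides with a center belongs fully to that center.
    for i, d in enumerate(dists):
        if d == 0.0:
            return [1.0 if j == i else 0.0 for j in range(len(centers))]
    exp = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** exp for d_k in dists) for d_i in dists]


def map_point(point, centers, m=2.0):
    """Map: emit (cluster index, (weighted point, weight)) for each cluster."""
    for i, u in enumerate(memberships(point, centers, m)):
        w = u ** m
        yield i, ([w * x for x in point], w)


def reduce_cluster(values, old_center):
    """Reduce: sum the partial numerators and weights, return the new center."""
    num = [0.0] * len(old_center)
    den = 0.0
    for vec, w in values:
        num = [a + b for a, b in zip(num, vec)]
        den += w
    # Keep the old center if no point contributed any weight.
    return [x / den for x in num] if den > 0.0 else list(old_center)


def fcm_iteration(points, centers, m=2.0):
    """Driver: one map/shuffle/reduce pass over the data."""
    grouped = defaultdict(list)  # stands in for the shuffle phase
    for p in points:
        for key, value in map_point(p, centers, m):
            grouped[key].append(value)
    return [reduce_cluster(grouped[i], c) for i, c in enumerate(centers)]


# Hypothetical one-dimensional data with two well-separated groups.
points = [(0.0,), (1.0,), (9.0,), (10.0,)]
centers = [[0.5], [9.5]]
new_centers = fcm_iteration(points, centers)
```

In a full run, a driver would repeat `fcm_iteration` until the centers move by less than a chosen tolerance. Because each mapper only needs the current centers and its own share of the points, the map step can in principle be spread across computing nodes, which is the property a scalability analysis would exercise.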
