☆ 4.5 Article

MapReduce-based fuzzy c-means clustering algorithm: implementation and scalability

INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS (2015)

Journal

INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS

Volume 6, Issue 6, Pages 923-934

Publisher

SPRINGER HEIDELBERG

DOI: 10.1007/s13042-015-0367-0

Keywords

MapReduce; Hadoop; Scalability

Ask authors/readers for more resources

Protocol

Community support

Reagent

Community support

Abstract

The management and analysis of big data has been identified as one of the most important emerging needs in recent years. This is because of the sheer volume and increasing complexity of data being created or collected. Current clustering algorithms can not handle big data, and therefore, scalable solutions are necessary. Since fuzzy clustering algorithms have shown to outperform hard clustering approaches in terms of accuracy, this paper investigates the parallelization and scalability of a common and effective fuzzy clustering algorithm named fuzzy c-means (FCM) algorithm. The algorithm is parallelized using the MapReduce paradigm outlining how the Map and Reduce primitives are implemented. A validity analysis is conducted in order to show that the implementation works correctly achieving competitive purity results compared to state-of-the art clustering algorithms. Furthermore, a scalability analysis is conducted to demonstrate the performance of the parallel FCM implementation with increasing number of computing nodes used.

MapReduce-based fuzzy c-means clustering algorithm: implementation and scalability

Journal

INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS

Publisher

SPRINGER HEIDELBERG

Keywords

Categories

Ask authors/readers for more resources

Protocol

Reagent

Authors

I am an author on this paper

Reviews

Primary Rating

Secondary Ratings

Novelty

Significance

Scientific rigor

Rate this paper

Recommended

MapReduce-based fuzzy c-means clustering algorithm: implementation and scalability

Journal

INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS

Publisher

SPRINGER HEIDELBERG

Keywords

Categories

Ask authors/readers for more resources

Protocol

Reagent

Authors

I am an author on this paper

Reviews

Primary Rating

Secondary Ratings

Novelty

Significance

Scientific rigor

Rate this paper

Recommended

Export Citation

Share Paper