Journal
INTELLIGENT SYSTEMS AND APPLICATIONS, VOL 1
Volume 542, Pages 771-788
Publisher
SPRINGER INTERNATIONAL PUBLISHING AG
DOI: 10.1007/978-3-031-16072-1_55
Keywords
Big data; Clustering; DENCLUE; Density clustering; Distributed clustering; MapReduce framework
This paper introduces MR-VDENCLUE, a parallel approximate variant of the VDENCLUE algorithm that can discover clusters with varying densities and handle big datasets.
The volume of data generated, processed, and consumed in the digital world is increasing exponentially. Clustering data at this scale, known as big data, requires highly scalable clustering methods. Density-based algorithms have attracted researchers' interest because they help reveal complex patterns in spatial datasets and can discover clusters of arbitrary shape. However, most density-based algorithms struggle in two respects: discovering clusters with varying densities and clustering big datasets. The VDENCLUE algorithm was proposed to discover clusters with varying densities, but it incurs a high computational overhead that makes it impractical for large datasets. This paper proposes MR-VDENCLUE, a parallel approximate variant of VDENCLUE. Besides discovering clusters with arbitrary shapes, MR-VDENCLUE can discover clusters with varying densities and scale up to handle big datasets.