Proceedings Paper

K-Medoid Clustering for Heterogeneous DataSets

PROCEEDINGS OF THE 4TH INTERNATIONAL CONFERENCE ON ECO-FRIENDLY COMPUTING AND COMMUNICATION SYSTEMS (2015)

Journal

PROCEEDINGS OF THE 4TH INTERNATIONAL CONFERENCE ON ECO-FRIENDLY COMPUTING AND COMMUNICATION SYSTEMS

Volume 70, Pages 226-237

Publisher

ELSEVIER SCIENCE BV

DOI: 10.1016/j.procs.2015.10.077

Keywords

Clustering; Heterogeneous datasets; L-1 norm; K-Medoid; Probabilistic Computation

Abstract

In recent years, various clustering strategies have been explored for partitioning datasets whose attributes span heterogeneous domains or types, such as categorical, numerical and binary. Clustering algorithms seek to identify homogeneous groups of objects based on the values of their attributes. These algorithms either assume that the attributes are of homogeneous types or convert them into homogeneous types. However, datasets with heterogeneous data types are common in real-life applications, and converting them can lead to loss of information. This paper proposes a new similarity measure, in the form of a triplet, to find the distance between two data objects with heterogeneous attribute types. A new k-medoid type of clustering algorithm is proposed that leverages this similarity measure in the form of a vector. The proposed algorithm is compared with traditional clustering algorithms, based on cluster validation using the Purity Index and the Davies-Bouldin index. Results show that the new clustering algorithm with the new similarity measure outperforms k-means clustering on mixed datasets. (C) 2015 The Authors. Published by Elsevier B.V.
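
The abstract does not give the exact form of the triplet or of the clustering procedure, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a hypothetical record layout of (numerical, categorical, binary) parts, an L1 norm on the numerical part, mismatch counts on the other two, an unweighted sum to collapse the triplet into one number, and a plain alternate-and-update k-medoid loop. All function names are illustrative.

```python
from collections import Counter
from typing import List, Sequence, Tuple

# Hypothetical layout: (numerical values, categorical values, binary values)
Record = Tuple[Sequence[float], Sequence[str], Sequence[int]]


def triplet_distance(a: Record, b: Record) -> Tuple[float, int, int]:
    """One distance component per attribute type (an assumed form)."""
    num = sum(abs(x - y) for x, y in zip(a[0], b[0]))  # L1 norm
    cat = sum(x != y for x, y in zip(a[1], b[1]))      # categorical mismatches
    bin_ = sum(x != y for x, y in zip(a[2], b[2]))     # Hamming distance
    return (num, cat, bin_)


def scalar_distance(a: Record, b: Record) -> float:
    # Assumption: collapse the triplet with an unweighted sum.
    return float(sum(triplet_distance(a, b)))


def assign(data: List[Record], medoids: List[int]) -> List[int]:
    """Label each record with the index of its nearest medoid."""
    return [
        min(range(len(medoids)),
            key=lambda j: scalar_distance(x, data[medoids[j]]))
        for x in data
    ]


def k_medoids(data: List[Record], k: int, max_iter: int = 100):
    """Simple k-medoid loop: assign records, then re-pick each medoid."""
    medoids = list(range(k))  # naive initialisation: first k records
    for _ in range(max_iter):
        labels = assign(data, medoids)
        new_medoids = []
        for j in range(k):
            members = [i for i, lab in enumerate(labels) if lab == j]
            if not members:  # keep the old medoid for an empty cluster
                new_medoids.append(medoids[j])
                continue
            # The new medoid minimises total distance to its cluster members.
            best = min(
                members,
                key=lambda c: sum(scalar_distance(data[c], data[m])
                                  for m in members),
            )
            new_medoids.append(best)
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(data, medoids)


def purity(labels: List[int], truth: List[str]) -> float:
    """Purity index: fraction of records in their cluster's majority class."""
    total = 0
    for j in set(labels):
        members = [t for lab, t in zip(labels, truth) if lab == j]
        total += Counter(members).most_common(1)[0][1]
    return total / len(labels)
```

In practice the numerical attributes would likely need scaling before the components are summed, since otherwise the L1 term can dominate the two mismatch counts; the sketch leaves that out.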
