☆ 4.3 Article

Evolutionary computing based hybrid bisecting clustering algorithm for multidimensional data

SADHANA-ACADEMY PROCEEDINGS IN ENGINEERING SCIENCES (2019)

Journal

SADHANA-ACADEMY PROCEEDINGS IN ENGINEERING SCIENCES

Volume 44, Issue 2, Pages -

Publisher

SPRINGER INDIA

DOI: 10.1007/s12046-018-1011-y

Keywords

Bisecting K-Means; clustering; feature normalization; T-test analysis; modified genetic algorithm; multidimensional data sets

Category

Engineering, Multidisciplinary

Abstract

Emerging technologies and data-centric applications have become an integral part of business intelligence, decision processes and numerous daily activities. To enable efficient pattern classification and data analysis, clustering has emerged as a promising mechanism that groups data elements based on feature homogeneity. Although K-Means clustering performs well on many clustering tasks, it struggles to achieve optimal classification with high-dimensional data sets. Numerous optimization efforts, including genetic algorithm (GA) based clustering, also require further optimization to avoid local minima. In this paper, an improved Canonical GA based Bisecting K-Means algorithm (CGABC) is developed. The proposed model applies min-max feature normalization to the high-dimensional data sets, followed by T-test analysis that significantly reduces the data dimensions based on feature similarity of the data elements. The fitness value is assigned based on inter-cluster (heterogeneous) and within-cluster (homogeneous) distances. To enable optimal selection of features and process parameters, particularly cluster center information, the conventional GA is modified by applying a multistage reproduction process with enhanced crossover and mutation. Bisecting K-Means clustering is then performed using the optimized cluster center information, which yields highly accurate and efficient clustering on high-dimensional data sets.
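
To make three of the steps named in the abstract concrete, the following is a minimal Python sketch of min-max normalization, a bisecting 2-means split, and a fitness score built from inter-cluster and within-cluster distances. It is not the paper's CGABC implementation: the T-test feature reduction and the modified GA are omitted, the farthest-point initialization is a simple stand-in for the GA-optimized cluster centers, and the ratio form of `fitness` is an assumption, since the abstract only says the fitness depends on both distances. All function names are hypothetical.

```python
import numpy as np

def min_max_normalize(X):
    """Scale each feature (column) to [0, 1]; constant columns map to 0."""
    X = np.asarray(X, dtype=float)
    lo = X.min(axis=0)
    span = X.max(axis=0) - lo
    span[span == 0] = 1.0  # avoid division by zero for constant features
    return (X - lo) / span

def two_means(X, n_iter=20):
    """Split X into two groups with plain 2-means; returns labels in {0, 1}.

    Assumption: centers start at the point farthest from the mean and the
    point farthest from that one (a stand-in for GA-optimized centers).
    """
    c0 = X[np.linalg.norm(X - X.mean(axis=0), axis=1).argmax()]
    c1 = X[np.linalg.norm(X - c0, axis=1).argmax()]
    centers = np.stack([c0, c1])
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        for j in (0, 1):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def bisecting_kmeans(X, k):
    """Repeatedly bisect the cluster with the largest squared error until k clusters."""
    labels = np.zeros(len(X), dtype=int)
    for new_label in range(1, k):
        sse = [((X[labels == c] - X[labels == c].mean(axis=0)) ** 2).sum()
               for c in range(new_label)]
        idx = np.flatnonzero(labels == int(np.argmax(sse)))
        halves = two_means(X[idx])
        labels[idx[halves == 1]] = new_label
    return labels

def fitness(X, labels):
    """Assumed fitness: mean inter-centroid distance over mean within-cluster
    distance (higher is better). Needs at least two non-empty clusters."""
    ids = np.unique(labels)
    centers = np.stack([X[labels == c].mean(axis=0) for c in ids])
    within = np.mean([np.linalg.norm(X[labels == c] - centers[i], axis=1).mean()
                      for i, c in enumerate(ids)])
    pairwise = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=2)
    between = pairwise[np.triu_indices(len(ids), k=1)].mean()
    return between / (within + 1e-12)
```

A typical call sequence under these assumptions would be `Xn = min_max_normalize(X)`, then `labels = bisecting_kmeans(Xn, k)`, then `fitness(Xn, labels)` to score the partition. In a GA setting, a score of this kind would be evaluated for each candidate set of cluster centers.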
