4.7 Article

Improved subspace clustering algorithm using multi-objective framework and subspace optimization

Journal

EXPERT SYSTEMS WITH APPLICATIONS
Volume 158, Issue -, Pages -

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.eswa.2020.113487

Keywords

Subspace clustering; Multi-objective Optimization (MOO); Intra-Cluster Compactness (ICC); Feature Non-Redundancy (FNR); Feature Per Cluster (FPC)

Funding

  1. Visvesvaraya PhD scheme for Electronics and IT
  2. Ministry of Electronics and Information Technology (MeitY), Government of India

Subspace clustering divides a data set into groups, or clusters, where each cluster comprises objects that share similar properties. The feature sets (subspace features) used to represent the clusters differ from one cluster to another. Moreover, the grouping of similar objects and the subspace feature set representing each group are identified simultaneously. In evolutionary-based machine learning problems, two critical measures of the quality of the generated clusters are compactness within clusters and separation between them. However, distance-based separation between two clusters may not be useful in subspace clustering, because the clusters may belong to two different subspaces. In addition, the selection of relevant subspace features plays a primary role in generating good-quality subspace clusters. Therefore, the proposed approach optimizes the subspace features using two new objective functions, feature non-redundancy (FNR) and feature per cluster (FPC), represented in the form of the PSM-index. Another objective function, intra-cluster compactness (ICC-index), is modified and used to optimize the compactness of objects within a cluster. An evolutionary multi-objective subspace clustering technique that optimizes these validity indices is then developed. A new mutation operator, duplication and deletion, and a modified version of exogenous genetic material uptake are developed to explore the search space effectively. The algorithm is tested on sixteen synthetic data sets and seven standard real-life data sets for identifying different subspace clusters. To show the effectiveness of using multiple objectives, it is also tested on three big data sets and the MNIST data set. An application of the proposed method to biclustering gene expression data is also shown. The results are compared against some state-of-the-art methods. Experimentation reveals that the proposed algorithm can take advantage of its evolvable genomic structure and the newly defined objective functions in the multi-objective framework. (c) 2020 Elsevier Ltd. All rights reserved.
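
To make the two central ideas of the abstract concrete, the following is a minimal Python sketch, not the paper's implementation. It assumes that compactness is measured only over a cluster's own subspace features and that candidate solutions are compared by Pareto dominance over several objectives to be minimized. The function names (`subspace_compactness`, `dominates`, `pareto_front`) and the exact compactness formula are illustrative assumptions; the paper's ICC-index and PSM-index definitions are not given in this listing.

```python
from typing import List, Sequence


def subspace_compactness(points: Sequence[Sequence[float]],
                         members: Sequence[int],
                         features: Sequence[int]) -> float:
    """Mean squared deviation from the cluster centroid, computed only on
    the cluster's own features (a hypothetical stand-in for a compactness
    index; lower means more compact)."""
    if not members or not features:
        raise ValueError("a cluster needs at least one member and one feature")
    # Centroid restricted to the cluster's subspace.
    centroid = {f: sum(points[i][f] for i in members) / len(members)
                for f in features}
    total = sum((points[i][f] - centroid[f]) ** 2
                for i in members for f in features)
    # Normalize by members and by subspace size so that clusters living in
    # subspaces of different dimensionality remain comparable.
    return total / (len(members) * len(features))


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every objective and strictly better
    on at least one (all objectives are minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(scores: Sequence[Sequence[float]]) -> List[int]:
    """Indices of the solutions that no other solution dominates."""
    return [i for i, s in enumerate(scores)
            if not any(dominates(t, s)
                       for j, t in enumerate(scores) if j != i)]


if __name__ == "__main__":
    data = [[0.0, 0.0, 5.0], [0.0, 2.0, 9.0], [10.0, 10.0, 1.0]]
    # Cluster {0, 1} described by features 0 and 1 only; feature 2 is ignored.
    print(subspace_compactness(data, members=[0, 1], features=[0, 1]))
    # Each tuple is one candidate clustering scored on two objectives.
    print(pareto_front([(1, 2), (2, 1), (2, 2), (3, 3)]))
```

Because each cluster is scored in its own subspace, two clusters can be compared on compactness even when they share no features, which is the situation in which distance-based separation between clusters stops being meaningful.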
