Article

Hierarchical Topology-Based Cluster Representation for Scalable Evolutionary Multiobjective Clustering

Journal

IEEE TRANSACTIONS ON CYBERNETICS
Volume 52, Issue 9, Pages 9846-9860

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TCYB.2021.3081988

Keywords

Clustering algorithms; Optimization; Shape; Diversity methods; Clustering methods; Bipartite graph; Task analysis; Clustering; ensemble strategy; multiobjective optimization; number of clusters; representation learning

Funding

  1. Natural Science Foundation of China [61973337]
  2. U.S. National Science Foundation's BEACON Center for the Study of Evolution in Action [DBI-0939454]
  3. China Scholarship Council

Evolutionary multiobjective clustering algorithms can outperform single-objective clustering algorithms when the number of clusters is not predetermined, but they incur a heavy computational burden. The proposed hierarchical, topology-based cluster representation simplifies the search procedure, improving both clustering performance and computing efficiency.
Evolutionary multiobjective clustering (MOC) algorithms have shown the potential to outperform conventional single-objective clustering algorithms, especially when the number of clusters k is not set before clustering. However, their computational cost becomes a serious obstacle, because the search space is large and evaluating the fitness of the evolving population is time-consuming, especially when the data size is large. This article proposes a new hierarchical, topology-based cluster representation for scalable MOC, which simplifies the search procedure and decreases computational overhead. A topological structure, trained from coarse to fine to fit the spatial distribution of the data, is used to identify a set of seed points (nodes); a tree-based graph is then built over these nodes to represent clusters. During optimization, a bipartite graph partitioning strategy that incorporates the graph nodes performs a cluster ensemble operation to generate offspring solutions more effectively. For determining the final result, a step that is underexplored in existing methods, a cluster ensemble strategy is also presented that applies whether or not k is provided. Comparison experiments on a series of different data distributions show that the proposed algorithm is superior in both clustering performance and computing efficiency.
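The idea of representing clusters through a tree over a small set of seed nodes can be illustrated with a minimal sketch. This is not the article's algorithm: farthest-point sampling stands in for the trained topological structure, a minimum spanning tree stands in for the tree-based graph, and a fixed rule (cutting the k-1 longest edges) stands in for the evolutionary search. All function names and parameters (`pick_seeds`, `minimum_spanning_tree`, `decode`, `cluster`, `m`, `k`) are hypothetical. The sketch only shows why the representation is cheap: a candidate solution is a choice of tree edges to cut among m seeds, rather than a label for each of the n data points.

```python
import numpy as np

def pick_seeds(X, m, rng):
    """Farthest-point sampling (assumption: a simple stand-in for the
    trained topological structure described in the abstract)."""
    chosen = [int(rng.integers(len(X)))]
    d = np.linalg.norm(X - X[chosen[0]], axis=1)
    for _ in range(m - 1):
        nxt = int(np.argmax(d))          # point farthest from all chosen seeds
        chosen.append(nxt)
        d = np.minimum(d, np.linalg.norm(X - X[nxt], axis=1))
    return X[chosen]

def minimum_spanning_tree(S):
    """Prim's algorithm over the seed nodes; returns (weight, i, j) edges."""
    m = len(S)
    dist = np.linalg.norm(S[:, None, :] - S[None, :, :], axis=2)
    in_tree = np.zeros(m, dtype=bool)
    in_tree[0] = True
    best = dist[0].copy()                # cheapest known link into the tree
    parent = np.zeros(m, dtype=int)
    edges = []
    for _ in range(m - 1):
        j = int(np.argmin(np.where(in_tree, np.inf, best)))
        edges.append((float(best[j]), int(parent[j]), j))
        in_tree[j] = True
        closer = dist[j] < best
        parent[closer] = j
        best = np.minimum(best, dist[j])
    return edges

def decode(edges, m, k):
    """Cut the k-1 longest tree edges; each remaining component of the
    tree is one cluster of seed nodes."""
    kept = sorted(edges)[: m - k]
    root = list(range(m))

    def find(a):
        while root[a] != a:
            root[a] = root[root[a]]      # path halving
            a = root[a]
        return a

    for _, i, j in kept:
        root[find(i)] = find(j)
    _, seed_labels = np.unique([find(a) for a in range(m)], return_inverse=True)
    return seed_labels

def cluster(X, m=20, k=3, seed=0):
    """Label every data point with the cluster of its nearest seed node."""
    rng = np.random.default_rng(seed)
    S = pick_seeds(X, m, rng)
    seed_labels = decode(minimum_spanning_tree(S), m, k)
    nearest = np.argmin(
        np.linalg.norm(X[:, None, :] - S[None, :, :], axis=2), axis=1
    )
    return seed_labels[nearest]

# Toy data: three well-separated 2-D blobs of 50 points each.
data_rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [20.0, 0.0], [0.0, 20.0]])
X = np.vstack([c + 0.5 * data_rng.standard_normal((50, 2)) for c in centers])
labels = cluster(X, m=20, k=3)
print(np.bincount(labels))
```

In an evolutionary setting, one would presumably let the optimizer choose which tree edges to cut and score each choice with several clustering objectives; that part, along with the bipartite graph partitioning used for the ensemble step, is deliberately left out here.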
