Article

An adaptive mutual K-nearest neighbors clustering algorithm based on maximizing mutual information

Journal

PATTERN RECOGNITION
Volume 137

Publisher

ELSEVIER SCI LTD
DOI: 10.1016/j.patcog.2022.109273

Keywords

Mutual K-nearest neighbors; Adaptive clustering; Maximizing mutual information

Clustering based on Mutual K-nearest Neighbors (CMNN) is a classical method for grouping data into clusters. However, its results depend heavily on the parameter k, and it misidentifies some points in small clusters as noise. To address these issues, we propose an adaptive improved CMNN algorithm (AVCMNN) in two parts: an improved CMNN algorithm (VCMNN) and an adaptive extension of it (AVCMNN). The experimental results show that VCMNN and AVCMNN outperform classical and state-of-the-art clustering algorithms in most cases.
Clustering based on Mutual K-nearest Neighbors (CMNN) is a classical method for grouping data into clusters. However, it has two well-known limitations: (1) the clustering results depend heavily on the parameter k; (2) CMNN assumes, under the Mutual K-nearest Neighbors (MKNN) criterion, that noise points correspond to small clusters, so some data points in small clusters are wrongly identified as noise. To address these two issues, we propose an adaptive improved CMNN algorithm (AVCMNN), which consists of two parts: (1) an improved CMNN algorithm (abbreviated as VCMNN) and (2) an adaptive VCMNN algorithm (abbreviated as AVCMNN). In the first part, VCMNN, we reassign the data points in some small clusters using a novel voting strategy, because some of them are wrongly identified as noise points; this improves the clustering results. In the second part, AVCMNN, we construct an objective function based on maximizing mutual information to optimize the parameters of the proposed method, obtaining better parameter values and clustering results. We conduct extensive experiments on twenty datasets, including six synthetic datasets, ten UCI datasets, and four image datasets. The experimental results show that VCMNN and AVCMNN outperform three classical algorithms (i.e., CMNN, DPC, and DBSCAN) and six state-of-the-art (SOTA) clustering algorithms in most cases. (c) 2022 Elsevier Ltd. All rights reserved.
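
To make the baseline behavior described in the abstract concrete, the following is a minimal sketch of mutual K-nearest neighbors clustering. It is not the authors' implementation. It assumes that clusters are the connected components of the mutual k-NN graph (two points are linked only if each is among the other's k nearest neighbors) and that components smaller than a threshold are labeled as noise. The function name `mutual_knn_clusters` and the parameter `min_size` are hypothetical, introduced only for illustration. The voting strategy of VCMNN and the mutual information objective of AVCMNN are not reproduced here, since the abstract does not specify them.

```python
import numpy as np

def mutual_knn_clusters(X, k, min_size=3):
    """Illustrative baseline (assumption, not the paper's code):
    connected components of the mutual k-NN graph, with components
    smaller than `min_size` labeled as noise (-1). Requires k < len(X)."""
    X = np.asarray(X, dtype=float)
    n = len(X)

    # Pairwise Euclidean distances; a point is never its own neighbor.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)

    # is_nn[i, j] is True when j is among the k nearest neighbors of i.
    knn = np.argsort(d, axis=1)[:, :k]
    is_nn = np.zeros((n, n), dtype=bool)
    is_nn[np.repeat(np.arange(n), k), knn.ravel()] = True

    # Keep an edge only when the neighbor relation holds in both directions.
    mutual = is_nn & is_nn.T

    # Label connected components with a depth-first traversal.
    labels = np.full(n, -1)
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in np.flatnonzero(mutual[i]):
                if labels[j] == -1:
                    labels[j] = current
                    stack.append(j)
        current += 1

    # Small components are treated as noise. This is the step that can
    # wrongly discard genuine points in small clusters.
    sizes = np.bincount(labels)
    labels[sizes[labels] < min_size] = -1
    return labels
```

In this sketch both `k` and `min_size` must be chosen by hand, which illustrates the two limitations the abstract names: the result changes with k, and any genuine cluster smaller than the threshold is discarded as noise.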
