4.5 Article

A hybrid similarity measure-based clustering approach for mixed attribute data

Publisher

SPRINGER HEIDELBERG
DOI: 10.1007/s13042-023-01968-6

Keywords

Clustering analysis; Mixed attributes; Hybrid similarity measure; Similarity mean; Allocation strategy

This paper proposes a clustering approach for mixed attribute data. A hybrid similarity measure and a similarity mean formula reduce the similarity differences among attribute types and the resulting skew when per-attribute similarities are superposed. The approach also avoids manual setting of similarity threshold parameters.
In mixed attribute clustering, the superposition of similarity measures is skewed because different attribute types are measured differently. This paper proposes a new clustering approach for mixed attribute data based on a hybrid similarity measure. First, a hybrid similarity measure formula is defined using information entropy, which effectively reduces the similarity difference among attribute types and alleviates the skew of the superposed similarity measure. Second, a formula for the similarity mean of mixed attributes is defined; it describes the central tendency of the data distribution and can be used effectively to merge clusters, so manual setting of similarity threshold parameters is avoided. Third, a novel clustering algorithm for mixed attributes is proposed that uses the hybrid similarity measure and an allocation strategy for boundary data objects. Finally, experiments on UCI data sets, artificial data sets, and stellar spectral data sets validate that the algorithm performs well in clustering quality, scalability, and noise resistance, and confirm the stability and effectiveness of the similarity mean.
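The abstract does not state the paper's formulas, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a range-normalized similarity for numeric attributes, a match/mismatch similarity for categorical attributes weighted by normalized Shannon entropy, and a similarity mean defined as the average pairwise similarity. The function names (`entropy_weights`, `hybrid_similarity`, `similarity_mean`) and the weighting scheme are hypothetical and chosen for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def entropy_weights(data, cat_idx):
    """Assumed weighting: entropy of each categorical column, scaled to [0, 1]."""
    weights = {}
    for i in cat_idx:
        column = [row[i] for row in data]
        k = len(set(column))
        # A column with a single value carries no information, so weight 0.
        weights[i] = entropy(column) / math.log2(k) if k > 1 else 0.0
    return weights


def hybrid_similarity(a, b, num_idx, cat_idx, ranges, weights):
    """Weighted average of per-attribute similarities, each in [0, 1]."""
    total, norm = 0.0, 0.0
    for i in num_idx:
        # Numeric attributes: 1 minus the range-normalized absolute difference.
        sim = 1.0 - abs(a[i] - b[i]) / ranges[i] if ranges[i] > 0 else 1.0
        total += sim
        norm += 1.0
    for i in cat_idx:
        # Categorical attributes: exact match, scaled by the entropy weight.
        total += weights[i] * (1.0 if a[i] == b[i] else 0.0)
        norm += weights[i]
    return total / norm if norm > 0 else 1.0


def similarity_mean(data, num_idx, cat_idx):
    """Assumed definition: mean hybrid similarity over all pairs of objects."""
    ranges = {i: max(r[i] for r in data) - min(r[i] for r in data) for i in num_idx}
    weights = entropy_weights(data, cat_idx)
    pairs = list(combinations(data, 2))
    return sum(
        hybrid_similarity(a, b, num_idx, cat_idx, ranges, weights) for a, b in pairs
    ) / len(pairs)


# Toy data: one numeric attribute (index 0) and one categorical attribute (index 1).
data = [(1.0, "a"), (2.0, "a"), (3.0, "b"), (5.0, "b")]
mean_sim = similarity_mean(data, num_idx=[0], cat_idx=[1])
```

In a sketch like this, a data-derived value such as `mean_sim` could serve as the merge criterion between clusters in place of a hand-tuned threshold, which is the role the abstract attributes to the similarity mean.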
