Article

Clustering mixed-type data using a probabilistic distance algorithm

APPLIED SOFT COMPUTING (2022)

Journal

APPLIED SOFT COMPUTING

Volume 130, Issue -, Pages -

Publisher

ELSEVIER

DOI: 10.1016/j.asoc.2022.109704

Keywords

Probabilistic distance clustering; Mixed-type data; Fuzzy clustering

Categories

Computer Science, Artificial Intelligence; Computer Science, Interdisciplinary Applications

Funding

San Jose State University Mathematics and Statistics department [3415040090]
Central RSCA of San Jose State University [18-RSG-08-046]

Abstract

This paper discusses a probabilistic distance clustering method adjusted for cluster size (PDQ) for handling mixed-type data, shows its advantages through a simulation design, and applies it to a real data set.

Cluster analysis is a broadly used unsupervised data analysis technique for finding groups of homogeneous units in a data set. Probabilistic distance clustering adjusted for cluster size (PDQ), discussed in this contribution, falls within the broad category of clustering methods initially developed to deal with continuous data; it has the advantage of fuzzy membership and robustness. However, a common issue in clustering deals with treating mixed-type data: continuous and categorical, which are among the most common types of data. This paper extends PDQ for mixed-type data using different dissimilarities for different kinds of variables. At first, the PDQ for mixed-type data is defined, then a simulation design shows its advantages compared to some state-of-the-art techniques, and ultimately, it is used on a real data set. The conclusion includes some future developments. © 2022 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
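As a rough illustration of the idea in the abstract, the sketch below combines one dissimilarity for continuous variables with another for categorical variables, then turns the resulting distances into fuzzy memberships. It is not the paper's algorithm. The function names, the Euclidean-plus-mismatch-count dissimilarity, and the `weight` parameter are assumptions made for this example. The membership rule assumes the usual size-adjusted probabilistic distance principle, in which a point's membership in cluster k is proportional to the cluster size q_k divided by its distance d_k to that cluster.

```python
import math


def mixed_dissimilarity(x_num, x_cat, c_num, c_cat, weight=1.0):
    """Illustrative mixed-type dissimilarity (an assumption, not the paper's).

    Continuous part: Euclidean distance.
    Categorical part: number of mismatching categories, scaled by `weight`.
    """
    euclid = math.sqrt(sum((a - b) ** 2 for a, b in zip(x_num, c_num)))
    mismatches = sum(a != b for a, b in zip(x_cat, c_cat))
    return euclid + weight * mismatches


def pdq_memberships(dists, sizes):
    """Fuzzy memberships with p_k proportional to q_k / d_k.

    Equivalently, p_k * d_k / q_k takes the same value for every cluster k.
    """
    # A point lying exactly on one or more centers belongs only to those.
    zero = [k for k, d in enumerate(dists) if d == 0]
    if zero:
        return [1.0 / len(zero) if k in zero else 0.0
                for k in range(len(dists))]
    ratios = [q / d for q, d in zip(sizes, dists)]
    total = sum(ratios)
    return [r / total for r in ratios]


# Hypothetical data: one point, two cluster centers, equal cluster sizes.
point_num, point_cat = (0.0, 0.0), ("a", "b")
centers = [((3.0, 4.0), ("a", "c")), ((0.0, 1.0), ("a", "b"))]
dists = [mixed_dissimilarity(point_num, point_cat, c_num, c_cat)
         for c_num, c_cat in centers]
memberships = pdq_memberships(dists, sizes=[0.5, 0.5])
```

With equal sizes, the point gets the larger membership in the cluster whose center is closer under the combined dissimilarity; increasing a cluster's size q_k raises its membership for the same distance.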
