Article

Improved Criteria for Clustering Based on the Posterior Similarity Matrix

Journal

BAYESIAN ANALYSIS
Volume 4, Issue 2, Pages 367-391

Publisher

INT SOC BAYESIAN ANALYSIS
DOI: 10.1214/09-BA414

Keywords

adjusted Rand index; cluster analysis; Dirichlet process mixture model; Markov chain Monte Carlo

Funding

  1. Deutsche Forschungsgemeinschaft [SFB 475]

In this paper we address the problem of obtaining a single clustering estimate $\hat{c}$ based on an MCMC sample of clusterings $c^{(1)}, c^{(2)}, \ldots, c^{(M)}$ from the posterior distribution of a Bayesian cluster model. Methods to derive $\hat{c}$ when the number of groups $K$ varies between the clusterings are reviewed and discussed. These include the maximum a posteriori (MAP) estimate and methods based on the posterior similarity matrix, a matrix containing the posterior probabilities that the observations $i$ and $j$ are in the same cluster. The posterior similarity matrix is related to a commonly used loss function by Binder (1978). Minimization of the loss is shown to be equivalent to maximizing the Rand index between estimated and true clustering. We propose new criteria for estimating a clustering, which are based on the posterior expected adjusted Rand index. The criteria are shown to possess a shrinkage property and outperform Binder's loss in a simulation study and in an application to gene expression data. They also perform favorably compared to other clustering procedures.
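
The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation, and every function name is hypothetical. It assumes each MCMC draw is a list of integer cluster labels, that Binder's loss uses equal misclassification costs, that the posterior expected adjusted Rand index is estimated by simply averaging the adjusted Rand index against every draw, and that the candidate estimates are restricted to the sampled clusterings themselves. The paper's own criteria may differ in all of these respects.

```python
from collections import Counter
from math import comb


def posterior_similarity_matrix(samples):
    """Entry [i][j] is the fraction of sampled clusterings with i and j together."""
    n, m = len(samples[0]), len(samples)
    counts = [[0] * n for _ in range(n)]
    for c in samples:
        for i in range(n):
            for j in range(n):
                if c[i] == c[j]:
                    counts[i][j] += 1
    return [[counts[i][j] / m for j in range(n)] for i in range(n)]


def binder_loss(candidate, psm):
    """Binder-type loss with equal costs (an assumption of this sketch):
    sum over pairs i < j of |I(c_i = c_j) - pi_ij|."""
    n = len(candidate)
    return sum(
        abs((1.0 if candidate[i] == candidate[j] else 0.0) - psm[i][j])
        for i in range(n)
        for j in range(i + 1, n)
    )


def adjusted_rand_index(a, b):
    """Adjusted Rand index between two label vectors of equal length."""
    n = len(a)
    pairs_joint = sum(comb(v, 2) for v in Counter(zip(a, b)).values())
    pairs_a = sum(comb(v, 2) for v in Counter(a).values())
    pairs_b = sum(comb(v, 2) for v in Counter(b).values())
    expected = pairs_a * pairs_b / comb(n, 2)
    max_index = (pairs_a + pairs_b) / 2
    if max_index == expected:
        # Only reached when both partitions are all singletons or both one cluster.
        return 1.0
    return (pairs_joint - expected) / (max_index - expected)


def mean_ari(candidate, samples):
    """Monte Carlo estimate of the posterior expected adjusted Rand index."""
    return sum(adjusted_rand_index(candidate, c) for c in samples) / len(samples)


def estimate_by_binder(samples):
    """Sampled clustering with the smallest Binder-type loss."""
    psm = posterior_similarity_matrix(samples)
    return min(samples, key=lambda c: binder_loss(c, psm))


def estimate_by_expected_ari(samples):
    """Sampled clustering with the largest estimated expected adjusted Rand index."""
    return max(samples, key=lambda c: mean_ari(c, samples))
```

Restricting the search to the sampled clusterings keeps the sketch short; a fuller treatment would also consider candidates outside the sample, for example clusterings obtained by hierarchical clustering on the posterior similarity matrix.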
