Article

Cross-Study Replicability in Cluster Analysis

STATISTICAL SCIENCE (2023)

Journal

STATISTICAL SCIENCE

Volume 38, Issue 2, Pages 303-316

Publisher

INST MATHEMATICAL STATISTICS-IMS

DOI: 10.1214/22-STS871

Keywords

Clustering; replicability; multiple studies

Category

Statistics & Probability

AI Summary

In cancer research, clustering techniques are widely used for exploratory analyses, playing a critical role in the identification of novel cancer subtypes and patient management. Our paper reviews methods for replicability of clustering analyses and proposes a novel framework for evaluating cross-study clustering replicability. The approach can be applied to any clustering algorithm and can quantify replicability using different measures of similarity between partitions.

Abstract

In cancer research, clustering techniques are widely used for exploratory analyses, playing a critical role in the identification of novel cancer subtypes and patient management. As data collected by multiple research groups grows, it is increasingly feasible to investigate the replicability of clustering procedures, that is, their ability to consistently recover biologically meaningful clusters across several data sets. In this paper, we review methods for replicability of clustering analyses, and discuss a novel framework for evaluating cross-study clustering replicability, useful when two or more studies are available. Our approach can be applied to any clustering algorithm and can employ different measures of similarity between partitions to quantify replicability, globally (i.e., for the whole sample) as well as locally (i.e., for individual clusters). Using experiments on synthetic and real gene expression data, we illustrate the usefulness of our procedure to evaluate if the same clusters are identified consistently across a collection of data sets.
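The abstract does not spell out the framework itself, so the following is only a generic illustration of what a "measure of similarity between partitions" can look like, not the authors' procedure. It computes the Rand index and the adjusted Rand index for two labelings of the same samples. The function names are hypothetical, and the step of obtaining two labelings of the same samples from different studies is assumed to have happened already.

```python
from collections import Counter
from itertools import combinations
from math import comb


def rand_index(labels_a, labels_b):
    """Fraction of sample pairs on which the two partitions agree
    (both place the pair together, or both place it apart)."""
    n = len(labels_a)
    if n != len(labels_b) or n < 2:
        raise ValueError("need two labelings of the same >= 2 samples")
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in combinations(range(n), 2)
    )
    return agree / comb(n, 2)


def adjusted_rand_index(labels_a, labels_b):
    """Rand index corrected for chance agreement (1.0 means identical
    partitions up to relabeling; about 0 means chance-level agreement)."""
    n = len(labels_a)
    if n != len(labels_b) or n < 2:
        raise ValueError("need two labelings of the same >= 2 samples")
    # Pairs co-clustered in both partitions, from the contingency table.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = together_a * together_b / comb(n, 2)
    maximum = (together_a + together_b) / 2
    if maximum == expected:  # degenerate case, e.g. both partitions trivial
        return 1.0
    return (together_both - expected) / (maximum - expected)


if __name__ == "__main__":
    # Same grouping with permuted cluster names: both indices are label-invariant.
    study_1 = [0, 0, 1, 1, 2, 2]
    study_2 = [1, 1, 0, 0, 2, 2]
    print(rand_index(study_1, study_2), adjusted_rand_index(study_1, study_2))
```

Both indices are global summaries over the whole sample. A local, per-cluster measure of the kind the abstract mentions would need a different construction, which is not sketched here.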
