4.6 Article

Convex clustering via l1 fusion penalization

出版社

OXFORD UNIV PRESS
DOI: 10.1111/rssb.12223

关键词

Convex clustering; Fusion penalties; Number of clusters; Rates of convergence

资金

  1. National Science Foundation [DMS-1209057]

向作者/读者索取更多资源

We study the large sample behaviour of a convex clustering framework, which minimizes the sample within cluster sum of squares under an l(1) fusion constraint on the cluster centroids. This recently proposed approach has been gaining in popularity; however, its asymptotic properties have remained mostly unknown. Our analysis is based on a novel representation of the sample clustering procedure as a sequence of cluster splits determined by a sequence of maximization problems. We use this representation to provide a simple and intuitive formulation for the population clustering procedure. We then demonstrate that the sample procedure consistently estimates its population analogue and we derive the corresponding rates of convergence. The proof conducts a careful simultaneous analysis of a collection of M-estimation problems, whose cardinality grows together with the sample size. On the basis of the new perspectives gained from the asymptotic investigation, we propose a key post-processing modification of the original clustering framework. We show, both theoretically and empirically, that the resulting approach can be successfully used to estimate the number of clusters in the population. Using simulated data, we compare the proposed method with existing number-of-clusters and modality assessment approaches and obtain encouraging results. We also demonstrate the applicability of our clustering method to the detection of cellular subpopulations in a single-cell virology study.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

4.6
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据