4.6 Article

NONEXCHANGEABLE RANDOM PARTITION MODELS FOR MICROCLUSTERING

Journal

ANNALS OF STATISTICS
Volume 49, Issue 4, Pages 1931-1957

Publisher

INST MATHEMATICAL STATISTICS-IMS
DOI: 10.1214/20-AOS2003

Keywords

Power-law; random partitions; completely random measure; stochastic process; sparse random graph

Funding

  1. EPSRC [EP/P026753/1, EP/L016710/1]
  2. ERC under the European Union's 7th Framework programme (FP7/2007-2013) ERC grant [617071]
  3. EPSRC [EP/P026753/1] Funding Source: UKRI

This paper introduces a flexible class of nonexchangeable random partition models that can generate partitions with cluster sizes growing sublinearly with the sample size, controlled by one parameter. Experiments on real data sets highlight the usefulness of the approach compared to a two-parameter Chinese restaurant process.
Many popular random partition models, such as the Chinese restaurant process and its two-parameter extension, fall within the class of exchangeable random partitions, and have found wide applicability in various fields. While the exchangeability assumption is sensible in many cases, it implies that the size of the clusters necessarily grows linearly with the sample size, and such a feature may be undesirable for some applications. We present here a flexible class of nonexchangeable random partition models, which are able to generate partitions whose cluster sizes grow sublinearly with the sample size, and where the growth rate is controlled by one parameter. Along with this result, we derive the asymptotic behaviour of the number of clusters of a given size, and show that the model can exhibit a power-law behaviour, controlled by another parameter. The construction is based on completely random measures and a Poisson embedding of the random partition, and inference is performed using a Sequential Monte Carlo algorithm. Experiments on real data sets emphasise the usefulness of the approach compared to a two-parameter Chinese restaurant process.
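
For readers unfamiliar with the exchangeable baseline mentioned in the abstract, the following is a minimal sketch of the standard two-parameter Chinese restaurant process seating rule. It is not the paper's nonexchangeable model and not the authors' code; the function name `sample_two_parameter_crp` and the parameter names `theta` (concentration) and `sigma` (discount) are chosen here for illustration only.

```python
import random


def sample_two_parameter_crp(n, theta=1.0, sigma=0.5, seed=0):
    """Return the cluster sizes after seating n items.

    Illustrative sketch of the standard two-parameter CRP (assumed
    parameterisation: discount 0 <= sigma < 1, concentration theta > -sigma).
    With i items already seated in K clusters, the next item
      - joins existing cluster k with probability (n_k - sigma) / (i + theta),
      - opens a new cluster with probability (theta + sigma * K) / (i + theta).
    """
    if not (0.0 <= sigma < 1.0) or theta <= -sigma:
        raise ValueError("need 0 <= sigma < 1 and theta > -sigma")
    rng = random.Random(seed)
    sizes = []
    for i in range(n):
        if not sizes:
            # The first item always opens the first cluster.
            sizes.append(1)
            continue
        # Unnormalised weights sum to i + theta.
        u = rng.random() * (i + theta)
        new_weight = theta + sigma * len(sizes)
        if u < new_weight:
            sizes.append(1)
            continue
        u -= new_weight
        for k, size in enumerate(sizes):
            u -= size - sigma
            if u < 0:
                sizes[k] += 1
                break
        else:
            # Guard against floating-point round-off at the upper boundary.
            sizes[-1] += 1
    return sizes


if __name__ == "__main__":
    for n in (1_000, 10_000):
        sizes = sample_two_parameter_crp(n)
        print(n, len(sizes), max(sizes) / n)
```

Running the script prints, for each sample size, the number of clusters and the fraction of items in the largest cluster. Under exchangeability that fraction is expected to stay bounded away from zero as the sample grows, which is the linear-growth property the abstract contrasts with microclustering.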
