Article

Fast Search and Estimation of Bayesian Nonparametric Mixture Models Using a Classification Annealing EM Algorithm

Journal

JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS
Volume 30, Issue 1, Pages 236-247

Publisher

TAYLOR & FRANCIS INC
DOI: 10.1080/10618600.2020.1807995

Keywords

Bayesian nonparametrics; Clustering; Completely random measures; Density estimation; Normalized random measures

Funding

  1. NSF [SES-1156372]

The article introduces a new fast-search algorithm for Bayesian nonparametric (BNP) infinite-mixture models, which can handle a wide range of BNP priors and is efficient in processing large datasets.
Bayesian nonparametric (BNP) infinite-mixture models provide flexible and accurate density estimation, cluster analysis, and regression. However, for the posterior inference of such a model, MCMC algorithms are complex, often need to be tailor-made for different BNP priors, and are intractable for large datasets. We introduce a BNP classification annealing EM algorithm which employs importance sampling estimation. This new fast-search algorithm, for virtually any given BNP mixture model, can quickly and accurately calculate the posterior predictive density estimate (by posterior averaging) and the maximum a posteriori clustering estimate (by simulated annealing), even for datasets containing millions of observations. The algorithm can handle a wide range of BNP priors because it primarily relies on the ability to generate prior samples. The algorithm can be fast because, in each iteration, it performs a sampling step for the (missing) clustering of the data points instead of a costly E-step, and then performs direct posterior calculations in the M-step, given the sampled (imputed) clustering. The new algorithm is illustrated and evaluated through BNP Gaussian mixture model analyses of benchmark simulated data and real datasets. MATLAB code for the new algorithm is provided in the supplementary materials. Supplementary materials for this article are available online.
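
The loop described in the abstract (sample a clustering, then update parameters given that clustering, under a cooling temperature) can be illustrated with a deliberately simplified sketch. This is not the authors' algorithm or their MATLAB code: it assumes a finite one-dimensional Gaussian mixture with a fixed number of components, and it uses plain maximum-likelihood updates in place of the BNP prior and the importance sampling estimation. The function name `annealed_classification_em`, the geometric cooling schedule, and the parameters `t_start` and `t_end` are hypothetical choices made for illustration only.

```python
import math
import random


def annealed_classification_em(data, k, n_iter=60, t_start=5.0, t_end=0.05, seed=0):
    """Toy classification-annealing EM for a 1-D finite Gaussian mixture.

    Assumption: fixed k and maximum-likelihood updates stand in for the
    BNP prior and importance-sampling estimates mentioned in the abstract.
    """
    rng = random.Random(seed)
    n = len(data)

    # Initialise means at spread-out order statistics; shared variance; equal weights.
    srt = sorted(data)
    means = [srt[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    centre = sum(data) / n
    var_all = max(sum((x - centre) ** 2 for x in data) / n, 1e-6)
    variances = [var_all] * k
    weights = [1.0 / k] * k
    labels = [0] * n

    for it in range(n_iter):
        # Geometric cooling from t_start down to t_end (an assumed schedule).
        temp = t_start * (t_end / t_start) ** (it / max(n_iter - 1, 1))

        # Sampling step: draw each point's cluster label from tempered
        # membership probabilities instead of computing a full E-step.
        for i, x in enumerate(data):
            logp = [
                math.log(weights[j])
                - 0.5 * math.log(2.0 * math.pi * variances[j])
                - 0.5 * (x - means[j]) ** 2 / variances[j]
                for j in range(k)
            ]
            top = max(logp)
            probs = [math.exp((lp - top) / temp) for lp in logp]
            labels[i] = rng.choices(range(k), weights=probs)[0]

        # M-step: closed-form updates given the sampled (imputed) labels.
        for j in range(k):
            members = [x for x, z in zip(data, labels) if z == j]
            # Smoothed weight keeps every component's weight strictly positive.
            weights[j] = (len(members) + 1.0) / (n + k)
            if len(members) < 2:
                continue  # keep the previous mean/variance for tiny clusters
            m = sum(members) / len(members)
            v = sum((x - m) ** 2 for x in members) / len(members)
            means[j] = m
            variances[j] = max(v, 1e-6)

    return labels, means, variances, weights
```

Calling `annealed_classification_em(data, k=2)` on a list of floats returns the final sampled labels together with the component means, variances, and weights. At high temperature the label draws are close to uniform, which lets the search move freely; as the temperature falls, the draws approach hard assignments to the most probable component, which is the simulated-annealing idea the abstract refers to for the clustering estimate.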
