Article

Gibbs Priors for Bayesian Nonparametric Variable Selection with Weak Learners

Journal

Journal of Computational and Graphical Statistics
Volume 32, Issue 3, Pages 1046-1059

Publisher

Taylor & Francis Inc.
DOI: 10.1080/10618600.2022.2142594

Keywords

Bayesian nonparametrics; Machine learning; Model selection/variable selection; Nonparametric regression

Funding

  1. National Science Foundation [DMS-2144933]

Abstract

This article investigates the problem of high-dimensional Bayesian nonparametric variable selection using an aggregation of weak learners. The authors address it by inducing sparsity in ensembles of weak learners through Gibbs distributions on random partitions and demonstrate the advantages of this approach.
We consider the problem of high-dimensional Bayesian nonparametric variable selection using an aggregation of so-called weak learners. The most popular variant of this is the Bayesian additive regression trees (BART) model, which is the natural Bayesian analog to boosting decision trees. In this article, we use Gibbs distributions on random partitions to induce sparsity in ensembles of weak learners. Viewing BART as a special case, we show that the class of Gibbs priors includes two recently proposed models, the Dirichlet additive regression trees (DART) model and the spike-and-forest model, as extremal cases, and we show that certain Gibbs priors are capable of achieving the benefits of both the DART and spike-and-forest models while avoiding some of their key drawbacks. We then show the promising performance of Gibbs priors for other classes of weak learners, such as tensor products of spline basis functions. A Pólya urn scheme is developed for efficient computation. Supplementary materials for this article are available online.
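As a rough illustration of the kind of Pólya urn computation the abstract alludes to, the sketch below sequentially assigns splitting rules (weak-learner components) to predictors under a Pitman-Yor prior, one member of the Gibbs-type family of partition priors. The function name, parameter values, and the choice of this particular special case are assumptions made for illustration only, not the paper's exact prior or sampler.

```python
import numpy as np

def gibbs_type_urn(n_splits, alpha=1.0, sigma=0.5, rng=None):
    """Sequentially assign n_splits splitting rules to predictors via the
    Polya urn of a Pitman-Yor process, one member of the Gibbs-type family
    of partition priors.

    Illustrative sketch only: the Pitman-Yor special case, parameter names,
    and defaults are assumptions, not the paper's exact prior.
    """
    rng = np.random.default_rng(rng)
    counts = []                        # counts[j] = splits already using predictor j
    labels = np.empty(n_splits, dtype=int)
    for i in range(n_splits):
        k = len(counts)
        # Urn weights: an existing predictor j is reused with weight n_j - sigma;
        # a previously unused predictor enters with weight alpha + sigma * k.
        weights = np.array([c - sigma for c in counts] + [alpha + sigma * k])
        j = rng.choice(k + 1, p=weights / weights.sum())
        if j == k:
            counts.append(1)           # activate a new predictor
        else:
            counts[j] += 1
        labels[i] = j
    return labels, counts

# Small alpha and sigma concentrate splits on few predictors (sparser
# variable usage); larger values spread splits across many predictors.
labels, counts = gibbs_type_urn(200, alpha=0.5, sigma=0.25, rng=0)
print(f"{len(counts)} active predictors; split counts per predictor: {counts}")
```

Changing how the reuse and new-predictor weights depend on the current counts moves the urn toward the Dirichlet-like (DART) or spike-and-forest extremes mentioned in the abstract; the specific weight functions used in the paper are not reproduced here.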

