Article

Model Interpretation Through Lower-Dimensional Posterior Summarization

Journal

JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS
Volume 30, Issue 1, Pages 144-161

Publisher

TAYLOR & FRANCIS INC
DOI: 10.1080/10618600.2020.1796684

Keywords

Decision theory; Graphical summary; Interpretable machine learning; Nonparametric regression; Partial effects


Nonparametric regression models have been gaining power and popularity due to the increasing size and complexity of datasets. However, these models can be difficult to interpret and may not meet the underlying inferential goals of analysts and decision makers. This article proposes a two-stage approach to create concise and interpretable summaries of complex models, allowing flexibility in choice of modeling techniques and inferential targets. A flexible model is first fit for accuracy, followed by the construction of lower-dimensional summaries with valid Bayesian uncertainty estimates.
Nonparametric regression models have recently surged in their power and popularity, accompanying the trend of increasing dataset size and complexity. While these models have proven their predictive ability in empirical settings, they are often difficult to interpret and do not address the underlying inferential goals of the analyst or decision maker. In this article, we propose a modular two-stage approach for creating parsimonious, interpretable summaries of complex models which allow freedom in the choice of modeling technique and the inferential target. In the first stage, a flexible model is fit which is believed to be as accurate as possible. In the second stage, lower-dimensional summaries are constructed by projecting draws from the distribution onto simpler structures. These summaries naturally come with valid Bayesian uncertainty estimates. Further, since we use the data only once to move from prior to posterior, these uncertainty estimates remain valid across multiple summaries and after iteratively refining a summary. We apply our method and demonstrate its strengths across a range of simulated and real datasets. The methods we present here are implemented in an R package available at github.com/spencerwoody/possum. Supplementary materials for this article are available online.
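The second stage described above can be illustrated with a minimal sketch. It assumes the simpler structure is a linear function of the covariates and that the projection is ordinary least squares applied to each posterior draw of the fitted values. It does not use the possum package or reproduce its interface: the function name `project_draws_onto_linear`, the simulated data, and the stand-in posterior draws are all hypothetical and serve only to show the mechanics.

```python
import numpy as np


def project_draws_onto_linear(X, f_draws):
    """Project posterior draws of fitted values onto a linear summary.

    X        : (n, p) array of covariates.
    f_draws  : (S, n) array, each row one posterior draw of f(x_1..x_n).
    Returns an (S, p + 1) array of projected coefficients, intercept first.
    """
    n = X.shape[0]
    design = np.column_stack([np.ones(n), X])
    # One least-squares solve handles every draw: each column of
    # f_draws.T is a separate right-hand side.
    coefs, *_ = np.linalg.lstsq(design, f_draws.T, rcond=None)
    return coefs.T


rng = np.random.default_rng(0)
n, S = 200, 500
X = rng.uniform(-1.0, 1.0, size=(n, 2))

# Stand-in for stage one (assumption): in practice these draws would come
# from a fitted flexible model. Here they are simulated around a nonlinear
# function purely for illustration.
base_f = np.sin(2.0 * X[:, 0]) + 0.5 * X[:, 1]
f_draws = (
    base_f
    + 0.10 * rng.standard_normal((S, 1))   # draw-level shift
    + 0.05 * rng.standard_normal((S, n))   # pointwise variation
)

# Stage two: project every draw, then summarize the projected coefficients.
beta_draws = project_draws_onto_linear(X, f_draws)
point = beta_draws.mean(axis=0)
lower, upper = np.percentile(beta_draws, [2.5, 97.5], axis=0)
```

Because least squares is linear in the fitted values, averaging the projected draws gives the same coefficients as projecting the posterior mean of the fitted values. The interval bounds `lower` and `upper` are computed from the same posterior draws, so no second pass over the data is needed.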

