Journal
JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS
Volume 30, Issue 1, Pages 144-161
Publisher
TAYLOR & FRANCIS INC
DOI: 10.1080/10618600.2020.1796684
Keywords
Decision theory; Graphical summary; Interpretable machine learning; Nonparametric regression; Partial effects
Nonparametric regression models have been gaining power and popularity due to the increasing size and complexity of datasets. However, these models can be difficult to interpret and may not meet the underlying inferential goals of analysts and decision makers. This article proposes a two-stage approach to create concise and interpretable summaries of complex models, allowing flexibility in choice of modeling techniques and inferential targets. A flexible model is first fit for accuracy, followed by the construction of lower-dimensional summaries with valid Bayesian uncertainty estimates.
Nonparametric regression models have recently surged in power and popularity, accompanying the trend of increasing dataset size and complexity. While these models have proven their predictive ability in empirical settings, they are often difficult to interpret and do not address the underlying inferential goals of the analyst or decision maker. In this article, we propose a modular two-stage approach for creating parsimonious, interpretable summaries of complex models that allows freedom in the choice of modeling technique and inferential target. In the first stage, a flexible model is fit that is believed to be as accurate as possible. In the second stage, lower-dimensional summaries are constructed by projecting draws from the posterior distribution onto simpler structures. These summaries naturally come with valid Bayesian uncertainty estimates. Further, since we use the data only once to move from prior to posterior, these uncertainty estimates remain valid across multiple summaries and after iteratively refining a summary. We apply our method and demonstrate its strengths across a range of simulated and real datasets. The methods we present here are implemented in an R package available at github.com/spencerwoody/possum. Supplementary materials for this article are available online.
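The two-stage idea in the abstract can be illustrated with a minimal sketch. This is not the possum R package or the authors' implementation. It makes several loudly stated assumptions: the "flexible" first-stage model is replaced by a conjugate Bayesian regression on a quadratic basis with known noise scale, the summary is a linear one under squared-error loss, and the function names (`posterior_function_draws`, `project_linear`) and the simulated data are hypothetical.

```python
import numpy as np


def posterior_function_draws(X, y, n_draws=500, noise_sd=0.5, prior_sd=10.0, seed=0):
    """Stage 1 (stand-in): posterior draws of the regression function at X.

    Assumption: a conjugate Gaussian linear model on the basis
    [1, x_j, x_j^2] with known noise_sd, used here only so the sketch
    runs with NumPy alone. Any model that yields posterior draws of
    f(X) could take its place.
    """
    rng = np.random.default_rng(seed)
    Phi = np.column_stack([np.ones(len(X)), X, X**2])
    precision = Phi.T @ Phi / noise_sd**2 + np.eye(Phi.shape[1]) / prior_sd**2
    cov = np.linalg.inv(precision)
    cov = (cov + cov.T) / 2  # symmetrize against floating-point round-off
    mean = cov @ Phi.T @ y / noise_sd**2
    weights = rng.multivariate_normal(mean, cov, size=n_draws)
    return weights @ Phi.T  # shape (n_draws, n_observations)


def project_linear(f_draws, X):
    """Stage 2: project each posterior draw of f(X) onto a linear summary.

    For every draw, solve min_beta ||f_draw - [1, X] beta||^2. The data y
    is not used again; only the posterior draws are.
    """
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, f_draws.T, rcond=None)
    return coef.T  # shape (n_draws, 1 + n_features)


# Hypothetical simulated data: linear in x1, nonlinear in x2.
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
y = 2.0 * X[:, 0] + X[:, 1] ** 2 + rng.normal(0.0, 0.5, size=200)

f_draws = posterior_function_draws(X, y)
summary_draws = project_linear(f_draws, X)

# Posterior mean and 95% credible interval for each summary coefficient.
for name, draws in zip(["intercept", "x1", "x2"], summary_draws.T):
    lo, hi = np.percentile(draws, [2.5, 97.5])
    print(f"{name}: mean={draws.mean():.3f}, 95% interval=({lo:.3f}, {hi:.3f})")
```

Because each summary coefficient is a function of a posterior draw, its spread across draws gives an uncertainty estimate for the summary itself. A different summary (for example, one using only `x1`) can be computed from the same `f_draws` without refitting the first-stage model.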