Article

Posterior Calibration of Posterior Predictive p Values

Journal

PSYCHOLOGICAL METHODS
Volume 22, Issue 2, Pages 382-396

Publisher

American Psychological Association
DOI: 10.1037/met0000142

Keywords

goodness-of-fit; posterior predictive p value; calibration; regression analysis; latent class analysis

Funding

  1. Netherlands Organization for Scientific Research (NWO) [VICI] [453-10-002]
  2. Netherlands Organization for Scientific Research (NWO) [VENI] [451-13-011]

Abstract

In order to accurately control the Type I error rate (typically .05), a p value should be uniformly distributed under the null model. The posterior predictive p value (ppp), which is commonly used in Bayesian data analysis, generally does not satisfy this property. For example, there have been reports where the sampling distribution of the ppp under the null model was highly concentrated around .50. In that case a ppp of .20 would indicate model misfit, but under the standard practice of comparing the p value with a significance level of .05, the null model would not be rejected. Consequently, the ppp has very little power to detect model misfit. A solution proposed in the literature calibrates the ppp using the prior distribution of the parameters under the null model. A disadvantage of this prior-cppp is that it is very sensitive to the prior of the model parameters. In this article, an alternative solution is proposed in which the ppp is calibrated using the posterior under the null model. This posterior-cppp (a) can be used when prior information is absent, (b) allows one to test any type of misfit by choosing an appropriate discrepancy measure, and (c) has a uniform distribution under the null model. The methodology is applied to various testing problems: testing independence of dichotomous variables, checking misfit of linear regression models in the presence of outliers, and assessing misfit in latent class analysis.

Translational Abstract

In psychological research we are often interested in better understanding underlying relationships between variables of interest or in identifying unobserved subgroups of respondents (e.g., depressed patients). For this purpose, researchers can build statistical models that mimic these relationships or latent subgroups. After observing the data, the statistical model can be applied to the data, and the relationships can be found or the subgroups can be understood. This can only be done adequately if the correct statistical model is used. It is therefore crucial to thoroughly check how well the model fits the data. The posterior predictive p value (ppp), a Bayesian statistical criterion, has been found useful for this purpose. The strength of the ppp is that it can be used to check virtually any type of model misfit. A disadvantage is that the ppp may fail to detect poor fit of a statistical model, thus yielding incorrect conclusions about the relationships or latent subgroups of interest. This article resolves this issue. First, the ppp is calibrated given the observed data, producing a hypothetical distribution of possible ppps that could reasonably be observed if the model were correct. Second, the observed ppp is compared with this hypothetical distribution of ppps to check whether the observed ppp was likely to have been generated under the employed model. Simulation studies show that the new calibrated ppp is better at detecting model misfit than the original ppp. The calibrated ppp therefore identifies more appropriate statistical models, providing more reliable conclusions about underlying relationships or latent subgroups of interest.
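
To make the procedure concrete, the following is a minimal sketch of a ppp computation in Python for a toy version of the outlier check mentioned above: a normal null model with the standard noninformative prior p(mu, sigma^2) proportional to 1/sigma^2, and the largest absolute standardized residual as the discrepancy measure. The model, prior, and discrepancy here are illustrative choices, not the paper's exact setup.

```python
import numpy as np

rng = np.random.default_rng(1)

def discrepancy(y, mu, sigma):
    # Largest absolute standardized residual: sensitive to outliers.
    return np.max(np.abs(y - mu) / sigma)

def ppp(y, n_draws=2000, rng=rng):
    # Posterior predictive p value under a normal null model with the
    # noninformative prior p(mu, sigma^2) proportional to 1/sigma^2
    # (an illustrative choice, not the paper's setup).
    n, ybar, s2 = len(y), y.mean(), y.var(ddof=1)
    exceed = 0
    for _ in range(n_draws):
        # Exact conjugate posterior draws:
        #   sigma^2 | y ~ scaled inverse chi-square(n - 1, s2)
        #   mu | sigma^2, y ~ N(ybar, sigma^2 / n)
        sigma2 = (n - 1) * s2 / rng.chisquare(n - 1)
        sigma = np.sqrt(sigma2)
        mu = rng.normal(ybar, np.sqrt(sigma2 / n))
        y_rep = rng.normal(mu, sigma, size=n)  # replicated data set
        exceed += discrepancy(y_rep, mu, sigma) >= discrepancy(y, mu, sigma)
    return exceed / n_draws
```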

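The calibration step described in the abstract can then be sketched as follows, reusing ppp() and rng from the block above: hypothetical data sets are generated from the posterior under the null model, the ppp is recomputed for each, and the observed ppp is compared against this reference distribution. The left-tail comparison and the simulation sizes are assumptions for illustration; the paper's exact algorithm may differ.

```python
def posterior_cppp(y_obs, n_cal=200, n_draws=500, rng=rng):
    # Calibrate the ppp using the posterior under the null model.
    # A sketch of the idea in the abstract, not the paper's exact algorithm.
    n, ybar, s2 = len(y_obs), y_obs.mean(), y_obs.var(ddof=1)
    ppp_obs = ppp(y_obs, n_draws, rng)
    ppp_ref = np.empty(n_cal)
    for k in range(n_cal):
        # Draw parameters from the posterior given the observed data ...
        sigma2 = (n - 1) * s2 / rng.chisquare(n - 1)
        mu = rng.normal(ybar, np.sqrt(sigma2 / n))
        # ... generate a hypothetical data set under the fitted null model ...
        y_hyp = rng.normal(mu, np.sqrt(sigma2), size=n)
        # ... and record the ppp that data set would have produced.
        ppp_ref[k] = ppp(y_hyp, n_draws, rng)
    # Small ppps signal misfit, so calibrate in the left tail; the result
    # is approximately uniform under the null model.
    return float(np.mean(ppp_ref <= ppp_obs))

# Hypothetical usage: fifty N(0, 1) draws plus one gross outlier.
y = np.concatenate([rng.normal(0.0, 1.0, 50), [6.0]])
print(ppp(y), posterior_cppp(y))
```

On data containing such an outlier, the calibrated value should flag misfit that the raw ppp, with its sampling distribution concentrated around .50 under the null, would tend to miss, which is the power gain the abstract describes.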
