Fitting Prediction Rule Ensembles to Psychological Research Data: An Introduction and Tutorial

Journal

PSYCHOLOGICAL METHODS
Volume 25, Issue 5, Pages 636-652

Publisher

AMER PSYCHOLOGICAL ASSOC
DOI: 10.1037/met0000256

Keywords

R; recursive partitioning; decision making; machine learning

Funding

  1. Swiss National Science Foundation [IZK0Z1_175531]

Abstract

Prediction rule ensembles (PREs) are a relatively new statistical learning method that aims to strike a balance between predictive performance and interpretability. Starting from a decision tree ensemble, such as a boosted tree ensemble or a random forest, PREs retain a small subset of tree nodes in the final predictive model. These nodes can be written as simple rules of the form if [condition] then [prediction]. As a result, PREs are often much less complex than full decision tree ensembles, yet they have been found to provide similar predictive performance in many situations. The current article introduces the methodology and shows, through several real-data examples from psychological research, how PREs can be fitted using the R package pre. The examples also illustrate a number of features of package pre that may be particularly useful for applications in psychology: support for categorical, multivariate and count responses, application of (non)negativity constraints, inclusion of confirmatory rules and standardized variable importance measures.

Translational Abstract

This manuscript presents prediction rule ensemble (PRE) methodology, a relatively new nonparametric exploratory regression method. It has been found to provide predictive performance close to that of modern machine-learning algorithms like random forests, while the fitted model consists of only a small number of rules and predictor variables. These rules are statements of the form if [condition] then [prediction], which are relatively easy for human decision makers (e.g., psychologists, medical doctors) to interpret. The rules can be used to identify persons or subgroups at higher or lower risk for a given disorder, for example: if [gender = male & age > 55 & symptom A is present] then [log-odds of having the disorder + 5]. The current paper introduces PRE methodology and shows how PREs can be fitted using the R package pre and how the results can be interpreted. This is shown through three real-data examples from psychological research: predicting chronic depressive trajectories, predicting academic achievement among first-year psychology students and predicting last-week substance use in a randomized clinical trial. The examples also serve to illustrate features of package pre that may be particularly useful for applications in psychology, for example its support for categorical, multivariate and count responses, and the possibility of identifying high- or low-risk subgroups only.
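
To make the rule format concrete, the example rule from the abstract (if [gender = male & age > 55 & symptom A is present] then [log-odds of having the disorder + 5]) can be sketched in a few lines of Python. This is only an illustration of how a single rule contributes to a prediction on the log-odds scale; it is not the API of the R package pre. The baseline log-odds of -3.0, the dictionary keys and all function names are assumptions made for the sketch and do not come from the article.

```python
import math

# Hypothetical baseline (intercept) on the log-odds scale; not from the article.
BASELINE_LOG_ODDS = -3.0
# Rule coefficient taken from the abstract's example: "log-odds ... + 5".
RULE_COEFFICIENT = 5.0


def rule_applies(person):
    """Condition part of the rule: gender = male & age > 55 & symptom A present."""
    return (
        person["gender"] == "male"
        and person["age"] > 55
        and person["symptom_a"]
    )


def predicted_log_odds(person):
    """Baseline plus the rule's coefficient whenever the condition holds."""
    indicator = 1.0 if rule_applies(person) else 0.0
    return BASELINE_LOG_ODDS + RULE_COEFFICIENT * indicator


def to_probability(log_odds):
    """Convert log-odds to a probability with the logistic function."""
    return 1.0 / (1.0 + math.exp(-log_odds))


if __name__ == "__main__":
    people = [
        {"gender": "male", "age": 60, "symptom_a": True},    # rule applies
        {"gender": "female", "age": 60, "symptom_a": True},  # rule does not apply
    ]
    for p in people:
        lo = predicted_log_odds(p)
        print(p, "log-odds:", lo, "probability:", round(to_probability(lo), 3))
```

A fitted PRE would typically combine several such rules, each with its own coefficient, in one additive model; the single-rule version above shows only the basic mechanism.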