4.2 Article

Predicting dropout from psychological treatment using different machine learning algorithms, resampling methods, and sample sizes

Journal

PSYCHOTHERAPY RESEARCH
Volume 33, Issue 6, Pages 683-695

Publisher

ROUTLEDGE JOURNALS, TAYLOR & FRANCIS LTD
DOI: 10.1080/10503307.2022.2161432

Keywords

dropout prediction; machine learning; supervised learning; sample size; resampling methods; data imbalance

This paper aims to improve dropout prediction in psychological interventions by comparing machine learning algorithms, sample sizes, and resampling methods. Resampling methods improved the performance of the ML algorithms; down-sampling is recommended because it was the fastest method and as accurate as the others. The highest mean F1-score (.51) required a training sample of at least 300 cases.
Objective: The occurrence of dropout from psychological interventions is associated with poor treatment outcome and high health, societal and economic costs. Recently, machine learning (ML) algorithms have been tested in psychotherapy outcome research. Dropout predictions are usually limited by imbalanced datasets and the size of the sample. This paper aims to improve dropout prediction by comparing ML algorithms, sample sizes and resampling methods.

Method: Twenty ML algorithms were examined in twelve subsamples (drawn from a sample of N = 49,602) using four resampling methods in comparison to the absence of resampling and to each other. Prediction accuracy was evaluated in an independent holdout dataset using the F1-measure.

Results: Resampling methods improved the performance of ML algorithms and down-sampling can be recommended, as it was the fastest method and as accurate as the other methods. For the highest mean F1-score of .51 a minimum sample size of N = 300 was necessary. No specific algorithm or algorithm group can be recommended.

Conclusion: Resampling methods could improve the accuracy of predicting dropout in psychological interventions. Down-sampling is recommended as it is the least computationally taxing method. The training sample should contain at least 300 cases.
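
The two ideas the abstract relies on, down-sampling an imbalanced training set and scoring predictions with the F1 measure, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `down_sample` and `f1_score`, the 0/1 label coding (1 = dropout), and the fixed random seed are assumptions made only for illustration.

```python
import random


def down_sample(rows, labels, seed=0):
    """Randomly discard majority-class cases until both classes have equal size.

    Hypothetical helper: labels are assumed to be coded 0/1 (1 = dropout).
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    # Keep every minority case plus an equally sized random draw from the majority.
    kept = sorted(minority + rng.sample(majority, len(minority)))
    return [rows[i] for i in kept], [labels[i] for i in kept]


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive (dropout) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Only the training data would be down-sampled in such a setup; the holdout set keeps its original class balance, so the F1 value reflects performance on the imbalanced distribution the model would meet in practice.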
