4.6 Article

Machine Learning Models for Predicting Adverse Pregnancy Outcomes in Pregnant Women with Systemic Lupus Erythematosus

Journal

DIAGNOSTICS
Volume 13, Issue 4, Pages -

Publisher

MDPI
DOI: 10.3390/diagnostics13040612

Keywords

prediction; machine learning; systemic lupus erythematosus; SLE; pregnancy; gestation; random forest


This study developed machine learning models that draw on the medical records of pregnant women with SLE to predict adverse pregnancy outcomes. After analysis and selection, 18 variables showed statistically significant differences between the two groups, and more than 40 variables were identified as contributing predictors. The Random Forest (RF) algorithm showed the best discrimination among the overall predictive models and the best accuracy in the real-time, gestation-based assessment. Machine learning models could compensate for the limitations of statistical methods when the sample is small and the variables are numerous, and the RF classifier performed well on such structured medical records.
Predicting adverse outcomes is essential for minimizing risks in pregnant women with systemic lupus erythematosus (SLE). Conventional statistical analysis may be limited by the small number of childbearing patients, even though their medical records are rich in information. This study aimed to develop predictive models that apply machine learning (ML) techniques to extract more of that information. We performed a retrospective analysis of 51 pregnant women with SLE, covering 288 variables. After correlation analysis and feature selection, six ML models were applied to the filtered dataset. The performance of these overall models was evaluated with the Receiver Operating Characteristic (ROC) curve. Real-time models covering different timespans of gestation were also explored. Eighteen variables showed statistically significant differences between the two groups, and more than forty variables were identified as contributing predictors by the ML variable selection strategies; the variables common to both sets were the influential indicators confirmed by both selection strategies. The Random Forest (RF) algorithm showed the best discrimination on the current dataset for the overall predictive models regardless of the data missing rate, with Multi-Layer Perceptron models ranking second. RF also achieved the best performance in the assessment of real-time predictive accuracy. ML models could compensate for the limitations of statistical methods when a small sample size coincides with a large number of variables, and the RF classifier performed best among the models tested on such structured medical records.
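The abstract evaluates its models with the ROC curve. As a minimal sketch of what that metric measures (not the authors' code), the area under the ROC curve can be computed as the fraction of positive/negative pairs in which the positive case receives the higher score. The function name `roc_auc` and the labels and scores below are hypothetical and purely illustrative; they are not data from the study.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    AUC equals the probability that a randomly chosen positive case receives
    a higher score than a randomly chosen negative case; ties count as 0.5.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # tie counts as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical labels (1 = adverse outcome) and model scores, illustration only.
y_true = [1, 0, 1, 0, 0, 1]
y_score = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]
auc = roc_auc(y_true, y_score)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. With a cohort as small as the one described, a single AUC value carries wide uncertainty, so in practice it would usually be reported together with some form of resampling.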

