4.2 Article

A Machine Learning Algorithm using Clinical and Demographic Data for All-Cause Preterm Birth Prediction

Journal

AMERICAN JOURNAL OF PERINATOLOGY
Volume -, Issue -, Pages -

Publisher

THIEME MEDICAL PUBL INC
DOI: 10.1055/s-0043-1776917

Keywords

preterm birth; machine learning; social determinants of health; XG-Boost; predictive algorithm

This study used machine learning to develop a predictive algorithm for all-cause preterm birth based on clinical, demographic, and laboratory data. The results showed that this information can predict preterm birth with moderate precision.

Objective: Preterm birth remains the predominant cause of perinatal mortality throughout the United States and the world, with well-documented racial and socioeconomic disparities. We aimed to develop and validate a machine learning algorithm that predicts all-cause preterm birth from clinical, demographic, and laboratory data.

Study Design: We performed a cohort study of pregnant individuals delivering at a single institution using prospectively collected information on clinical conditions, patient demographics, laboratory data, and health care utilization. Our primary outcome was all-cause preterm birth before 37 weeks. The dataset was randomly divided into a derivation cohort (70%) and a separate validation cohort (30%). Predictor variables were selected from among 33 that had been previously identified in the literature (directed machine learning). In the derivation cohort, both a statistical model (logistic regression) and a machine learning model (XG-Boost) were used to derive the best fit, as judged by the C-statistic, and were then validated in the validation cohort. We measured model discrimination with the C-statistic and assessed model performance and calibration to determine whether the model provided clinical decision-making benefits.

Results: The cohort included a total of 12,440 deliveries among 12,071 individuals. Preterm birth occurred in 2,037 births (16.4%). The derivation cohort consisted of 8,708 deliveries (70%) and the validation cohort of 3,732 (30%). XG-Boost was chosen for its robustness and its ability to deal with missing data and collinearity between predictor variables. The top five predictor variables identified as drivers of preterm birth, by feature importance metric, were multiple gestation, number of emergency department visits in the year prior to the index pregnancy, unknown initial body mass index, gravidity, and prior preterm delivery. Test performance characteristics were similar between the two populations (derivation cohort area under the curve [AUC] = 0.70 vs. validation cohort AUC = 0.63).

Conclusion: Clinical, demographic, and laboratory information can be useful to predict all-cause preterm birth with moderate precision.

Key Points: Machine learning can be used to create models to predict preterm birth. In our model, all-cause preterm birth can be predicted with moderate precision. Clinical, demographic, and laboratory information can be useful to predict all-cause preterm birth.
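
The abstract reports a random 70%/30% derivation/validation split and uses the C-statistic (equivalent to the AUC for a binary outcome) to measure discrimination. The following is a minimal sketch of those two steps in plain Python, not the authors' code. The function names `split_cohort` and `c_statistic`, the seed, and the toy outcomes and risks are all hypothetical illustrations. Only the cohort size (12,440 deliveries) comes from the abstract.

```python
import random


def split_cohort(records, derivation_fraction=0.7, seed=0):
    """Randomly divide records into derivation and validation cohorts.

    The seed is an arbitrary choice for reproducibility (an assumption,
    not something the abstract specifies).
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * derivation_fraction)
    return shuffled[:cut], shuffled[cut:]


def c_statistic(outcomes, risks):
    """Probability that a randomly chosen case (outcome 1) has a higher
    predicted risk than a randomly chosen non-case (outcome 0).
    Tied risks count as one half."""
    cases = [r for y, r in zip(outcomes, risks) if y == 1]
    controls = [r for y, r in zip(outcomes, risks) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for case_risk in cases:
        for control_risk in controls:
            if case_risk > control_risk:
                concordant += 1.0
            elif case_risk == control_risk:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


# Splitting 12,440 delivery indices 70/30 gives cohorts of 8,708 and 3,732.
derivation, validation = split_cohort(range(12440))
print(len(derivation), len(validation))

# Toy data (hypothetical, not from the study): 1 = preterm, 0 = term.
toy_outcomes = [1, 0, 1, 0, 0, 1]
toy_risks = [0.9, 0.2, 0.4, 0.5, 0.1, 0.7]
# 8 of the 9 case/non-case pairs are ranked correctly.
print(c_statistic(toy_outcomes, toy_risks))
```

A C-statistic of 0.5 corresponds to ranking by chance and 1.0 to perfect ranking, which puts the reported values of 0.70 and 0.63 in context. The pairwise loop is written for clarity. A rank-based formula gives the same value more efficiently on large cohorts.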
