Article

Machine Learning-Based Prediction of Elevated PTH Levels Among the US General Population

Journal

JOURNAL OF CLINICAL ENDOCRINOLOGY & METABOLISM
Volume 107, Issue 12, Pages 3222-3230

Publisher

ENDOCRINE SOC
DOI: 10.1210/clinem/dgac544

Keywords

parathyroid hormone; hyperparathyroidism; machine learning; prediction model; NHANES

This study aimed to develop a machine learning-based prediction model for elevated PTH levels in US adults. The results showed that the random forest, GBM, and SuperLearner models had the highest AUC, while the logistic regression model with splines had the best calibration performance.
Context: Although elevated parathyroid hormone (PTH) levels are associated with higher mortality risks, evidence is limited as to when PTH is expected to be elevated, and thus when it should be measured, in the general population.

Objective: This work aimed to build a machine learning-based prediction model of elevated PTH levels from demographic, lifestyle, and biochemical data among US adults.

Methods: This population-based study included adults aged 20 years or older with a measurement of serum intact PTH from the National Health and Nutrition Examination Survey (NHANES) 2003 to 2006. We used the NHANES 2003 to 2004 cohort (n = 4096) to train 6 machine-learning prediction models (logistic regression with and without splines, lasso regression, random forest, gradient-boosting machines [GBMs], and SuperLearner). We then used the NHANES 2005 to 2006 cohort (n = 4112) to evaluate model performance, including the area under the receiver operating characteristic curve (AUC).

Results: Of 8208 US adults, 753 (9.2%) had PTH greater than 74 pg/mL. Across the 6 algorithms, the highest AUCs were observed for random forest (AUC [95% CI] = 0.79 [0.76-0.81]), GBM (AUC [95% CI] = 0.78 [0.75-0.81]), and SuperLearner (AUC [95% CI] = 0.79 [0.76-0.81]). The AUC improved from 0.69 to 0.77 when we added cubic splines for the estimated glomerular filtration rate (eGFR) to the logistic regression models. Logistic regression models with splines showed the best calibration performance (calibration slope [95% CI] = 0.96 [0.86-1.06]), while the other algorithms were less well calibrated. Among all covariates included, eGFR was the most important predictor in the random forest and GBM models.

Conclusion: Using nationally representative data from the United States, we developed a prediction model that may support accurate and early detection of elevated PTH in general clinical practice. Future studies are warranted to assess whether this prediction tool for elevated PTH would improve adverse health outcomes.
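The two performance measures reported above, AUC and calibration slope, can be illustrated with a minimal sketch. This is not the authors' code: the function names (`auc`, `calibration_slope`, `simulate`) and the simulated data are assumptions made for illustration only. The AUC is computed from the rank-sum (Mann-Whitney) identity, and the calibration slope is taken as the coefficient from a logistic regression of the observed outcome on the logit of the predicted probability, where a slope near 1 suggests good calibration.

```python
import math
import random


def auc(y_true, scores):
    """AUC via the rank-sum (Mann-Whitney) identity, with average ranks for ties."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    rank_sum_pos = sum(r for r, y in zip(ranks, y_true) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def logit(p, eps=1e-12):
    p = min(max(p, eps), 1 - eps)  # guard against log(0)
    return math.log(p / (1 - p))


def calibration_slope(y_true, probs, n_iter=50):
    """Fit y ~ a + b * logit(p) by Newton-Raphson; return (intercept a, slope b)."""
    x = [logit(p) for p in probs]
    a, b = 0.0, 1.0
    for _ in range(n_iter):
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for xi, yi in zip(x, y_true):
            mu = 1 / (1 + math.exp(-(a + b * xi)))
            w = mu * (1 - mu)
            g_a += yi - mu          # gradient w.r.t. intercept
            g_b += (yi - mu) * xi   # gradient w.r.t. slope
            h_aa += w
            h_ab += w * xi
            h_bb += w * xi * xi
        det = h_aa * h_bb - h_ab * h_ab
        if det == 0:
            break
        da = (h_bb * g_a - h_ab * g_b) / det
        db = (h_aa * g_b - h_ab * g_a) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < 1e-10:
            break
    return a, b


def simulate(n=4000, seed=0):
    """Hypothetical, perfectly calibrated predictions: y ~ Bernoulli(p)."""
    rng = random.Random(seed)
    probs = [rng.uniform(0.05, 0.95) for _ in range(n)]
    y = [1 if rng.random() < p else 0 for p in probs]
    return y, probs


if __name__ == "__main__":
    print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
    y, probs = simulate()
    print(auc(y, probs), calibration_slope(y, probs))
```

In a setting like the one described, `y_true` would be the indicator of PTH greater than 74 pg/mL in the held-out cohort and `probs` the predicted probabilities from a model trained on the earlier cohort. Confidence intervals such as those reported would need an additional step (for example, bootstrapping), which this sketch omits.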
