Article

Comparison of Machine Learning Techniques for Mortality Prediction in a Prospective Cohort of Older Adults

Publisher

MDPI
DOI: 10.3390/ijerph182312806

Keywords

ageing; all-cause mortality; imbalanced data; machine learning; mortality prediction; older adults; prediction models

Funding

  1. European Regional Development Fund (ERDF) under Ireland's European Structural and Investment Funds Programmes 2014-2020
  2. INTERREG Northern Periphery and Arctic (NPA) [95]
  3. Science Foundation Ireland [12/RC/2289-P2 INSIGHT-2, 13/RC/2077 CONNECT]
  4. ERDF
  5. Enterprise Ireland
  6. Department of Business, Enterprise and Innovation under the DTIF project HOLISTICS

As global demographics shift, ageing is becoming a significant focus, and applying proper prognostic indices to clinical decisions on mortality prediction is increasingly important. Machine learning has the potential to transform prognostic modelling, as illustrated by the development of machine learning models for all-cause mortality prediction in healthy older adults. Random undersampling combined with random forest gave the best results; adding probability calibration slightly reduced average performance but increased model robustness.
As global demographics change, ageing is a worldwide phenomenon of growing interest in our rapidly changing society. The application of proper prognostic indices in clinical decisions regarding mortality prediction has therefore become important, both for personalized risk management (i.e., identifying patients who are at high or low risk of death) and for helping to ensure effective healthcare services for patients. Consequently, prognostic modelling expressed as all-cause mortality prediction is an important step for effective patient management. Machine learning has the potential to transform prognostic modelling. This paper reports results on the development of machine learning models for all-cause mortality prediction in a cohort of healthy older adults. The models are based on features covering anthropometric variables, physical and lab examinations, questionnaires, and lifestyles, as well as wearable data collected in free-living settings, obtained for the Healthy Ageing Initiative study conducted on 2291 recruited participants. Several machine learning techniques, including feature engineering, feature selection, data augmentation and resampling, were investigated for this purpose. A detailed empirical comparison of the impact of the different techniques is presented and discussed. The achieved performances were also compared with a standard epidemiological model. This investigation showed that, for the dataset under consideration, the best results were achieved with Random UnderSampling in conjunction with Random Forest (either with or without probability calibration). Including probability calibration slightly reduced the average performance, but it increased the model robustness, as indicated by the lower 95% confidence intervals.
The analysis showed that machine learning models could provide results comparable to standard epidemiological models while being completely data-driven and disease-agnostic, thus demonstrating the opportunity for building machine learning models on health records data for research and clinical practice. However, further testing is required to significantly improve the model performance and its robustness.
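
The abstract names Random UnderSampling as the resampling step behind the best-performing models. The following is a minimal sketch of that idea only, using the Python standard library and toy data; it is not the authors' implementation. The function name `random_undersample`, the toy labels, and the choice to shrink every class to the size of the smallest one are assumptions made for illustration.

```python
import random
from collections import Counter


def random_undersample(rows, labels, seed=0):
    """Randomly drop samples from larger classes until every class
    has as many samples as the smallest class (hypothetical helper)."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    # Every class is reduced to the size of the rarest class.
    target = min(len(indices) for indices in by_class.values())

    kept = []
    for indices in by_class.values():
        kept.extend(rng.sample(indices, target))
    kept.sort()  # preserve the original sample order

    return [rows[i] for i in kept], [labels[i] for i in kept]


# Toy, made-up data: 1 = died during follow-up (minority), 0 = survived.
labels = [1] * 5 + [0] * 45
rows = [[i] for i in range(50)]

X_bal, y_bal = random_undersample(rows, labels)
balanced_counts = Counter(y_bal)
```

In a full pipeline, resampling of this kind would normally be applied to the training split only, so that evaluation data keep the original class imbalance. The balanced sample would then be passed to a classifier such as a random forest, optionally followed by probability calibration; which libraries and settings the study used for those steps is not stated here.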
