☆ 4.6 Review

Predicting population health with machine learning: a scoping review

BMJ OPEN (2020)

Journal

BMJ OPEN

Volume 10, Issue 10, Pages -

Publisher

BMJ PUBLISHING GROUP

DOI: 10.1136/bmjopen-2020-037860

Keywords

public health; epidemiology; statistics & research methods

Category

Medicine, General & Internal

Funding

Canadian Institutes of Health Research [FRN: 72054363]
Canada Research Chair in Population Health Analytics

Abstract

Objective To determine how machine learning has been applied to prediction applications in population health contexts. Specifically, to describe which outcomes have been studied, the data sources most widely used and whether reporting of machine learning predictive models aligns with established reporting guidelines.

Design A scoping review.

Data sources MEDLINE, EMBASE, CINAHL, ProQuest, Scopus, Web of Science, Cochrane Library, INSPEC and ACM Digital Library were searched on 18 July 2018.

Eligibility criteria We included English articles published between 1980 and 2018 that used machine learning to predict population-health-related outcomes. We excluded studies that only used logistic regression or were restricted to a clinical context.

Data extraction and synthesis We summarised findings extracted from published reports, which included general study characteristics, aspects of model development, reporting of results and model discussion items.

Results Of 22 618 articles found by our search, 231 were included in the review. The USA (n=71, 30.74%) and China (n=40, 17.32%) produced the most studies. Cardiovascular disease (n=22, 9.52%) was the most studied outcome. The median number of observations was 5414 (IQR=16 543.5) and the median number of features was 17 (IQR=31). Health records (n=126, 54.5%) and investigator-generated data (n=86, 37.2%) were the most common data sources. Many studies did not incorporate recommended guidelines on machine learning and predictive modelling. Predictive discrimination was commonly assessed using area under the receiver operator curve (n=98, 42.42%) and calibration was rarely assessed (n=22, 9.52%).

Conclusions Machine learning applications in population health have concentrated on regions and diseases well represented in traditional data sources, infrequently using big data. Important aspects of model development were under-reported. Greater use of big data and reporting guidelines for predictive modelling could improve machine learning applications in population health.

Registration number Registered on the Open Science Framework on 17 July 2018 (available at https://osf.io/rnqe6/).
