Article

Comparing machine learning with case-control models to identify confirmed dengue cases

Journal

PLOS NEGLECTED TROPICAL DISEASES
Volume 14, Issue 11

Publisher

PUBLIC LIBRARY SCIENCE
DOI: 10.1371/journal.pntd.0008843

Funding

  1. National Health Research Institutes, Taiwan [MR-108-GP-14, NHRI-108A1-MRCO-0319191]
  2. Ministry of Science and Technology, Taiwan [MOST-103-2314-B-006-009-MY3, MOST-107-2923-B-006-001, MOST-108-2923-B-006-001]

In recent decades, the global incidence of dengue has increased. Affected countries have responded with more effective surveillance strategies to detect outbreaks early, monitor trends, and implement prevention and control measures. We applied recently developed machine learning approaches to identify laboratory-confirmed dengue cases among 4,894 emergency department patients with dengue-like illness (DLI) who received laboratory tests. Of these, 60.11% (2,942 cases) were confirmed to have dengue. Using just four input variables [age, body temperature, white blood cell count (WBC), and platelet count], both the state-of-the-art deep neural network (DNN) prediction models and the conventional decision tree (DT) and logistic regression (LR) models achieved areas under the receiver operating characteristic (ROC) curve (AUCs) ranging from 83.75% to 85.87% [DT: 84.60% ± 0.03%; DNN: 85.87% ± 0.54%; LR: 83.75% ± 0.17%]. Subgroup analyses found that all the models were highly sensitive, particularly in the pre-epidemic period: pre-peak sensitivities (< 35 weeks) were 92.6%, 92.9%, and 93.1% for DT, DNN, and LR, respectively. Adjusted odds ratios estimated with LR for low WBC count [≤ 3.2 × 10³/μL], fever (≥ 38 °C), low platelet count [< 100 × 10³/μL], and older age (≥ 65 years) were 5.17 [95% confidence interval (CI): 3.96-6.76], 3.17 [95% CI: 2.74-3.66], 3.10 [95% CI: 2.44-3.94], and 1.77 [95% CI: 1.50-2.10], respectively. Our prediction models can readily be used in resource-poor countries where viral or serologic tests are inconvenient to perform. They can also be applied to real-time syndromic surveillance to monitor trends in dengue cases, and could be integrated with mosquito and environmental surveillance to support early warning and immediate prevention and control measures.
In other words, a local community hospital or clinic equipped to perform complete blood counts (including platelets) can provide sentinel screening during outbreaks. In conclusion, the machine learning approach can facilitate medical and public health efforts to minimize the health threat of dengue epidemics. However, laboratory confirmation remains the primary goal of surveillance and outbreak investigation.

Author summary

Identifying dengue cases early is crucial but challenging for healthcare professionals. The challenge grows during large epidemics and is a particular problem in non-endemic areas with few experienced staff. To improve dengue diagnosis, we investigated how to exploit machine learning (ML)-based prediction models and identified four key variables [age, fever, white blood cell count (WBC), and platelet count], which are consistent with clinical and epidemiological knowledge. With these variables, the ML prediction models [decision tree (DT) and deep neural network (DNN)] and the logistic regression model developed for identifying laboratory-confirmed dengue cases produced areas under the receiver operating characteristic (ROC) curve (AUCs) ranging from 83.75% to 85.87%. This implies that the prediction models may serve as a pivotal component of an integrated dengue surveillance system, and they require only a single complete blood count (CBC) examination. The sensitivities, positive predictive values, and accuracies for major risk factors in the two machine learning models were close to those of the regression models. For future applications, the DNN models, which performed best, can be employed at epidemic sites with adequate computer facilities, while the DT and regression models, whose prediction logic is interpretable, can be employed at sites with limited or no computer facilities. Artificial intelligence and the clinical parameters identified in this study may help when laboratories are overwhelmed, but should never replace laboratory confirmation.
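
As a reading aid, the adjusted odds ratios quoted in the abstract can be converted to the log-odds scale on which a logistic regression model operates. The sketch below is not the authors' code. It assumes that the four odds ratios come from a single main-effects model with no interaction terms, and that the 95% confidence intervals are symmetric on the log scale, so that an approximate standard error can be recovered from them. The names `REPORTED`, `log_odds_summary`, and `combined_odds_ratio` are hypothetical. Because the abstract does not report the model intercept, the sketch yields relative odds only, not predicted probabilities.

```python
import math

# Adjusted odds ratios and 95% CIs as quoted in the abstract (LR model).
# Dictionary keys are illustrative labels, not names used by the authors.
REPORTED = {
    "low_wbc": (5.17, 3.96, 6.76),        # WBC <= 3.2 x 10^3/uL
    "fever": (3.17, 2.74, 3.66),          # body temperature >= 38 C
    "low_platelets": (3.10, 2.44, 3.94),  # platelets < 100 x 10^3/uL
    "older_age": (1.77, 1.50, 2.10),      # age >= 65 years
}

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_odds_summary(odds_ratio, ci_low, ci_high):
    """Return (beta, approximate standard error) on the log-odds scale.

    Assumes the CI is symmetric around beta on the log scale.
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    return beta, se


def combined_odds_ratio(present):
    """Odds ratio for a patient with all listed risk factors versus none.

    Assumes a main-effects model, so log-odds contributions simply add.
    """
    return math.exp(sum(math.log(REPORTED[name][0]) for name in present))


if __name__ == "__main__":
    for name, (odds_ratio, ci_low, ci_high) in REPORTED.items():
        beta, se = log_odds_summary(odds_ratio, ci_low, ci_high)
        print(f"{name}: beta = {beta:.3f}, approx. SE = {se:.3f}")
    both = combined_odds_ratio(["low_wbc", "fever"])
    print(f"low WBC + fever vs. neither: OR = {both:.2f}")
```

Under these assumptions, a patient with both a low WBC count and fever has roughly 5.17 × 3.17 ≈ 16.4 times the odds of confirmed dengue of a patient with neither, holding the other two variables fixed. If the fitted model included interaction terms, this multiplication would not hold.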
