4.6 Article

Computational prediction of diagnosis and feature selection on mesothelioma patient health records

期刊

PLOS ONE
卷 14, 期 1, 页码 -

出版社

PUBLIC LIBRARY SCIENCE
DOI: 10.1371/journal.pone.0208737

关键词

-

资金

  1. Peter Munk Cardiac Centre

向作者/读者索取更多资源

Background Mesothelioma is a lung cancer that kills thousands of people worldwide annually, especially those with exposure to asbestos. Diagnosis of mesothelioma in patients often requires time-consuming imaging techniques and biopsies. Machine learning can provide for a more effective, cheaper, and faster patient diagnosis and feature selection from clinical data in patient records. Methods and findings We analyzed a dataset of health records of 324 patients having mesothelioma symptoms from Turkey. The patients had prior asbestos exposure and displayed symptoms consistent with mesothelioma. We compared probabilistic neural network, perceptron-based neural network, random forest, one rule, and decision tree classifiers to predict diagnosis of the patient records. We measured classifiers' performance through standard confusion matrix scores such as Matthews correlation coefficient (MCC). Random forest outperformed all models tried, obtaining MCC = +0.37 on the complete imbalanced dataset and MCC = +0.64 on the under-sampled balanced dataset. We then employed random forest feature selection to identify the two most relevant dataset traits associated with mesothelioma: lung side and platelet count. These two risk factors resulted so predictive, that decision tree focusing on them achieved the second top accuracy on the complete dataset diagnosis prediction (MCC = +0.28), outperforming all other methods and even decision tree itself applied to all features. Conclusions Our results show that machine learning can predict diagnoses of patients having mesothelioma symptoms with high accuracy, sensitivity, and specificity, in few minutes. Additionally, random forest can efficiently select the most important features of this clinical dataset (lung side and platelet count) in few seconds. The importance of pleural plaques in lung sides and blood platelets in mesothelioma diagnosis indicates that physicians should focus on these two features when reading records of patients with mesothelioma symptoms. Moreover, doctors can exploit our machinery to predict patient diagnosis when only lung side and platelet data are available.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

4.6
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据