Article

Using Naive Bayesian Analysis to Determine Imaging Characteristics of KRAS Mutations in Metastatic Colon Cancer

Journal

DIAGNOSTICS
Volume 7, Issue 3

Publisher

MDPI
DOI: 10.3390/diagnostics7030050

Keywords

naive Bayesian classification; radiogenomics; RAS mutation; machine learning; natural language processing

Funding

  1. National Institutes of Health [R01-HL137193, R01-EB24403, R21-EB021148, R03-CA172738]
  2. Mayo Clinic

Abstract

Genotype, particularly RAS status, greatly affects the prognosis and treatment of liver metastasis in colon cancer patients. This pilot study aimed to apply word frequency analysis and a naive Bayes classifier to radiology reports to extract imaging descriptors that distinguish wild-type colon cancer patients from those with v-Ki-ras2 Kirsten rat sarcoma viral oncogene homolog (KRAS) mutations. In this institutional-review-board-approved study, we compiled a SNaPshot mutation analysis dataset from 457 colon adenocarcinoma patients. From this cohort, we analyzed the radiology reports (>32,000 reports) of 299 patients who either were wild-type (147 patients) or had a KRAS mutation (152 patients). Our algorithm determined word frequency within the wild-type and mutant radiology reports and used a naive Bayes classifier to determine the probability of a given word belonging to either group. Words with a greater than 50% chance of being in the KRAS mutation group and the highest absolute probability difference compared to the wild-type group included several, innumerable, confluent, and numerous (p < 0.01). In contrast, words with a greater than 50% chance of being in the wild-type group and the highest absolute probability difference included few, discrete, and [no] recurrent (p = 0.03). Words used in radiology reports, which have direct implications for disease course, tumor burden, and therapy, appear with differing frequency in patients with KRAS mutations versus wild-type colon adenocarcinoma. Moreover, the likely characteristic imaging traits of mutant tumors make probabilistic word analysis useful in identifying unique characteristics and disease course, with applications ranging from radiology and pathology reports to clinical notes.
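
The per-word probability described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the tokenizer, the add-one (Laplace) smoothing parameter `alpha`, the use of report counts as class priors, and the four toy reports are all assumptions made for illustration. The sketch counts word frequencies in each group, applies Bayes' rule to estimate P(KRAS | word), and ranks words by the absolute difference between the two group probabilities. It does not reproduce the significance testing behind the reported p-values.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a report and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def word_posteriors(mutant_reports, wild_reports, alpha=1.0):
    """Return P(KRAS-mutant | word) for every word seen in either group."""
    mut_counts = Counter(w for r in mutant_reports for w in tokenize(r))
    wt_counts = Counter(w for r in wild_reports for w in tokenize(r))
    vocab = set(mut_counts) | set(wt_counts)
    n_mut, n_wt = sum(mut_counts.values()), sum(wt_counts.values())

    # Class priors estimated from the number of reports per group (assumption).
    n_reports = len(mutant_reports) + len(wild_reports)
    prior_mut = len(mutant_reports) / n_reports
    prior_wt = len(wild_reports) / n_reports

    posteriors = {}
    for word in vocab:
        # Smoothed likelihoods P(word | group); alpha avoids zero probabilities.
        like_mut = (mut_counts[word] + alpha) / (n_mut + alpha * len(vocab))
        like_wt = (wt_counts[word] + alpha) / (n_wt + alpha * len(vocab))
        # Bayes' rule, normalized over the two groups.
        joint_mut = like_mut * prior_mut
        joint_wt = like_wt * prior_wt
        posteriors[word] = joint_mut / (joint_mut + joint_wt)
    return posteriors


# Hypothetical toy reports, not data from the study.
mutant_reports = [
    "Innumerable confluent liver lesions",
    "Numerous hepatic lesions",
]
wild_reports = [
    "Few discrete liver lesions",
    "No recurrent disease",
]

posteriors = word_posteriors(mutant_reports, wild_reports)

# |P(KRAS | word) - P(wild-type | word)| equals |2 * P(KRAS | word) - 1|.
ranked = sorted(posteriors.items(), key=lambda kv: abs(2 * kv[1] - 1), reverse=True)
for word, p_mut in ranked:
    group = "KRAS" if p_mut > 0.5 else "wild-type" if p_mut < 0.5 else "neither"
    print(f"{word:12s} P(KRAS|word)={p_mut:.2f}  leans {group}")
```

In this toy example, a word that appears only in the mutant reports gets a probability above 50% of belonging to the KRAS group, a word that appears only in the wild-type reports gets one below 50%, and a word used equally in both groups sits at exactly 50%.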
