Article

Machine learning-based tissue of origin classification for cancer of unknown primary diagnostics using genome-wide mutation features

Journal

NATURE COMMUNICATIONS
Volume 13, Issue 1, Pages -

Publisher

NATURE PORTFOLIO
DOI: 10.1038/s41467-022-31666-w

Funding

  1. Stichting Hanarth Fonds, The Netherlands

This study demonstrates the use of whole-genome DNA sequencing and a machine learning model called Cancer of Unknown Primary Location Resolver to classify metastatic tumors, improving diagnosis and treatment decision-making.
Cancers of unknown primary (CUP) origin account for ~3% of all cancer diagnoses, whereby the tumor tissue of origin (TOO) cannot be determined. Using a uniformly processed dataset encompassing 6756 whole-genome sequenced primary and metastatic tumors, we develop Cancer of Unknown Primary Location Resolver (CUPLR), a random forest TOO classifier that employs 511 features based on simple and complex somatic driver and passenger mutations. CUPLR distinguishes 35 cancer (sub)types with ~90% recall and ~90% precision based on cross-validation and test set predictions. We find that structural variant derived features increase the performance and utility for classifying specific cancer types. With CUPLR, we could determine the TOO for 82/141 (58%) of CUP patients. Although CUPLR is based on machine learning, it provides a human interpretable graphical report with detailed feature explanations. The comprehensive output of CUPLR complements existing histopathological procedures and can enable improved diagnostics for CUP patients.

The original tumor location can be unclear for metastatic tumors. Here, the authors show that DNA sequencing of whole genomes can be used to classify metastatic tumors using a machine learning model, Cancer of Unknown Primary Location Resolver, in order to improve diagnosis and inform treatment decisions.
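The recall and precision figures quoted in the abstract are per-class classification metrics. As a minimal sketch of how such figures are computed, the Python below scores a handful of toy predictions. The function name `per_class_recall_precision` and the labels are hypothetical and invented for illustration; they are not CUPLR code or data from the study. Only the 82/141 fraction is taken from the abstract.

```python
from collections import Counter


def per_class_recall_precision(true_labels, predicted_labels):
    """Return {label: (recall, precision)} for every label in either list."""
    true_count = Counter(true_labels)            # samples that truly belong to each class
    predicted_count = Counter(predicted_labels)  # samples assigned to each class
    true_positive = Counter()                    # samples assigned to their true class
    for truth, guess in zip(true_labels, predicted_labels):
        if truth == guess:
            true_positive[truth] += 1

    scores = {}
    for label in sorted(set(true_labels) | set(predicted_labels)):
        # Recall: share of a class's true samples that were recovered.
        recall = true_positive[label] / true_count[label] if true_count[label] else 0.0
        # Precision: share of a class's predictions that were correct.
        precision = (
            true_positive[label] / predicted_count[label]
            if predicted_count[label]
            else 0.0
        )
        scores[label] = (recall, precision)
    return scores


# Toy labels, invented for illustration only (not data from the study).
truth = ["Breast", "Breast", "Lung", "Lung", "Lung", "Colorectal"]
guess = ["Breast", "Lung", "Lung", "Lung", "Colorectal", "Colorectal"]
scores = per_class_recall_precision(truth, guess)

# Fraction of CUP patients with a resolved tissue of origin, as reported in the abstract.
cup_resolved_fraction = 82 / 141

for label, (recall, precision) in scores.items():
    print(f"{label}: recall={recall:.2f}, precision={precision:.2f}")
print(f"CUP patients resolved: {cup_resolved_fraction:.0%}")
```

In this toy case, one of the two "Breast" samples is recovered (recall 0.5) and the single "Breast" prediction is correct (precision 1.0). Averaging such per-class values over all classes is one common way to summarize a multi-class classifier; the abstract does not state which averaging the authors used.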
