4.6 Article

Machine Learning Approaches to Classify Primary and Metastatic Cancers Using Tissue of Origin-Based DNA Methylation Profiles

Journal

CANCERS
Volume 13, Issue 15, Pages -

Publisher

MDPI
DOI: 10.3390/cancers13153768

Keywords

DNA methylation; TCGA; biomarkers; clustering; differential methylation; metastasis; epigenetics; machine learning; artificial intelligence; explainable predictions

Funding

  1. Estonian Research Council [PRG1076]
  2. Enterprise Estonia [EU48695]
  3. Horizon 2020 innovation grant (ERIN) [EU952516]
  4. H2020 SoBigData++ project
  5. CHIST-ERA SAI project

Cancer metastasis is a significant cause of cancer deaths, with DNA methylation changes playing a key role in cancer prediction. Machine learning techniques show great promise in cancer classification.
Simple Summary: Cancer metastasis is considered one of the most significant causes of cancer morbidity and accounts for up to 90% of cancer deaths. Accurately identifying a cancer's origin and the types of cancer cells it comprises is crucial in enabling clinicians to choose better treatment options for patients. DNA methylation changes are increasingly recognized as informative for cancer prediction, especially for the transition to metastasis. Research in the last decade has shown the promise of artificial intelligence (AI) in cancer classification. In this study, we applied several machine learning techniques (machine learning being a branch of AI) to identify the cancer tissue of origin and further classified cancer samples as primary or metastatic based on publicly available DNA methylation data. Overall, our analysis resulted in 99% accuracy for predicting cancer subtypes based on the tissue of origin.

Abstract: Metastatic cancers account for up to 90% of cancer-related deaths. Clearly differentiating metastatic cancers from primary cancers is crucial for identifying the cancer type and developing targeted treatment for each type. DNA methylation patterns are suggested to be an intriguing target for cancer prediction and are also considered an important mediator of the transition to metastatic cancer. In the present study, we used 9303 methylome samples spanning 24 cancer types, downloaded from publicly available data repositories, including The Cancer Genome Atlas (TCGA) and the Gene Expression Omnibus (GEO). We constructed machine learning classifiers to discriminate metastatic, primary, and non-cancerous methylome samples. We applied support vector machine (SVM), Naive Bayes (NB), extreme gradient boosting (XGBoost), and random forest (RF) models to classify the cancer types based on their tissue of origin. RF outperformed the other classifiers, with an average accuracy of 99%. Moreover, we applied local interpretable model-agnostic explanations (LIME) to explain which methylation biomarkers are important for classifying cancer types.
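The classification setup described above (methylation values per sample as features, tissue of origin as the label) can be illustrated with a minimal sketch. This is not the authors' pipeline: it uses only the Python standard library, implements a small Gaussian Naive Bayes classifier as a stand-in for one of the four model families named in the abstract, and runs on synthetic beta values. The class names (`tissue_A` and so on), the number of CpG sites, the noise level, and all function names are assumptions made for illustration only.

```python
import math
import random
from collections import defaultdict

CLASSES = ["tissue_A", "tissue_B", "tissue_C"]  # hypothetical tissue labels


def make_synthetic_samples(n_per_class, n_sites, seed=0):
    """Generate toy methylation beta values (0..1) for each hypothetical class."""
    rng = random.Random(seed)
    X, y = [], []
    for k, label in enumerate(CLASSES):
        # Assumption: each class has its own mean methylation level per site.
        means = [0.2 + 0.3 * ((j + k) % 3) for j in range(n_sites)]
        for _ in range(n_per_class):
            row = [min(1.0, max(0.0, rng.gauss(m, 0.05))) for m in means]
            X.append(row)
            y.append(label)
    return X, y


def fit_gaussian_nb(X, y, var_floor=1e-4):
    """Estimate a log prior plus per-site mean and variance for every class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        variances = [
            max(var_floor, sum((v - m) ** 2 for v in col) / n)
            for col, m in zip(columns, means)
        ]
        model[label] = (math.log(n / len(X)), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest Gaussian log posterior for one sample."""
    def log_posterior(params):
        log_prior, means, variances = params
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
            for x, m, var in zip(row, means, variances)
        )
    return max(model, key=lambda label: log_posterior(model[label]))


def accuracy(model, X, y):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(predict(model, r) == label for r, label in zip(X, y)) / len(y)


# Toy data: 50 samples per class, 20 CpG sites, 80/20 train/test split.
X, y = make_synthetic_samples(n_per_class=50, n_sites=20)
order = list(range(len(X)))
random.Random(1).shuffle(order)
split = int(0.8 * len(order))
train, test = order[:split], order[split:]

model = fit_gaussian_nb([X[i] for i in train], [y[i] for i in train])
test_accuracy = accuracy(model, [X[i] for i in test], [y[i] for i in test])
print(f"held-out accuracy on toy data: {test_accuracy:.2f}")
```

The synthetic classes are well separated by construction, so the accuracy printed here says nothing about real methylome data. With real data, the same structure (feature matrix, label vector, held-out evaluation) would typically be handed to established library implementations of SVM, NB, XGBoost, and RF rather than hand-written code.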
