4.7 Article

Diagnostic classification of cancers using extreme gradient boosting algorithm and multi-omics data

Journal

COMPUTERS IN BIOLOGY AND MEDICINE
Volume 121, Issue -, Pages -

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.compbiomed.2020.103761

Keywords

Diagnostic classification; Machine learning; Extreme gradient boosting; Multi-omics data; Cancer

Funding

  1. National Natural Science Foundation of China [61471078, 81974439]

Ask authors/readers for more resources

Accurate diagnostic classification of cancers can greatly help physicians to choose surveillance and treatment strategies for patients. Following the explosive growth of huge amounts of biological data, the shift from traditional biostatistical methods to computer-aided means has made machine-learning methods as an integral part of today's cancer prognosis prediction. In this work, we proposed a classification model by leveraging the power of extreme gradient boosting (XGBoost) and using increasingly complex multi-omits data with the aim to separate early stage and late stage cancers. We applied XGBoost model to four kinds of cancer data downloaded from TCGA and compared its performance with other popular machine-learning methods. The experimental results showed that our method obtained statistically significantly better or comparable predictive performance. The results of this study also revealed that DNA methylation outperforms other molecular data (mRNA expression and miRNA expression) in terms of accuracy and stability for discriminating between early stage and late stage groups. Furthermore, integration of multi-omits data by autoencoder can enhance the classification accuracy of cancer stage. Finally, we conducted bioinformatics analyses to assess the medical utility of the significant genes ranked by their importance using XGBoost algorithm. Extensively comparative experiments demonstrated that the XGBoost method has a remarkable performance in predicting the stage of cancer patients with multi-omits data. Moreover, identification of novel candidate genes associated with cancer stages would contribute to further elucidate disease pathogenesis and develop novel therapeutics.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.7
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available