Article

Prediction of prostate cancer biochemical recurrence by using discretization supports the critical contribution of the extra-cellular matrix genes

Journal

SCIENTIFIC REPORTS
Volume 13, Issue 1

Publisher

NATURE PORTFOLIO
DOI: 10.1038/s41598-023-35821-1

This study proposes a methodology that uses discretization to predict biochemical recurrence of prostate cancer while optimizing the necessary variables. The discretization step can improve the prediction accuracy of biochemical recurrence and identifies a subset of ten genes related to tissue structure. Adding a clinical biomarker, prostate specific antigen (PSA), further improves the prediction of biochemical recurrence.
Because prostate cancer is complex, much effort has been devoted to developing biomarkers that have acquired the utmost clinical relevance for its diagnosis and grading. However, these advances are limited by the relatively large percentage of biochemical recurrence (BCR) and the limited strategies for follow-up. This work proposes a methodology that uses discretization to predict prostate cancer BCR while optimizing the necessary variables. We used discretization of RNA-seq data to improve the prediction of biochemical recurrence and to retrieve a subset of ten genes functionally known to be related to tissue structure. Equal width and equal frequency discretization methods were compared to isolate, simultaneously, the contribution of the genes and their interval of action. Adding a robust clinical biomarker such as prostate specific antigen (PSA) improved the prediction of BCR. With a five-bin equal width discretization, patients were classified with an accuracy of 82% on the testing datasets and 75% on a validation dataset. After data pre-processing, feature selection and classification, our predictions had a precision of 71% (testing datasets: MSKCC and GSE54460) and 69% (validation dataset: GSE70769) for patients presenting BCR up to 24 months after their final treatment. These results support the use of equal width discretization as a pre-processing step to improve classification with a limited number of genes in the signature. Functionally, many of these genes have a direct or expected role in tissue structure and extracellular matrix organization. The processing steps presented in this study are also applicable to other cancer types, where they may increase the speed and accuracy of models on diverse datasets.
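The two discretization schemes compared in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names `equal_width_bins` and `equal_frequency_bins` and the toy expression values are assumptions made here for illustration, and only the five-bin setting comes from the abstract. Equal width splits the observed range into intervals of the same length, while equal frequency places the cut points at quantiles so that each bin holds roughly the same number of samples.

```python
from bisect import bisect_right
from statistics import quantiles


def equal_width_bins(values, n_bins=5):
    """Map each value to a bin index 0..n_bins-1 using intervals of equal width."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Degenerate case: every value is identical, so use a single bin.
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # Interior cut points only; the maximum then falls in the last bin.
    cuts = [lo + width * i for i in range(1, n_bins)]
    return [bisect_right(cuts, v) for v in values]


def equal_frequency_bins(values, n_bins=5):
    """Map each value to a bin index 0..n_bins-1 using quantile cut points."""
    # method="inclusive" treats the data as the full population of interest.
    cuts = quantiles(values, n=n_bins, method="inclusive")
    return [bisect_right(cuts, v) for v in values]


# Hypothetical expression values for one gene across ten samples,
# with one outlier to show how the two schemes differ.
expression = [0, 1, 2, 3, 4, 5, 6, 7, 8, 100]

width_codes = equal_width_bins(expression)
freq_codes = equal_frequency_bins(expression)

print(width_codes)
print(freq_codes)
```

On this toy input the outlier stretches the equal width intervals so that nine samples share the lowest bin, whereas equal frequency spreads the samples two per bin. Which behaviour is preferable depends on the data; the abstract reports that equal width with five bins worked best in this study.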
