Article

Data-Independent Acquisition Mass Spectrometry of EPS-Urine Coupled to Machine Learning: A Predictive Model for Prostate Cancer

Journal

ACS OMEGA

Publisher

AMER CHEMICAL SOC
DOI: 10.1021/acsomega.2c05487

Prostate cancer (PCa) is the most common cancer in males. Current diagnostic methods based on PSA and DRE are limited in specificity and sensitivity, and cannot distinguish aggressive from indolent PCa. This study analyzed expressed prostatic secretion (EPS)-urine samples from PCa and BPH patients and identified proteins differentially expressed between the two groups. Using machine learning algorithms, a predictive model was built from the proteins sema7A and SPARC together with the FT ratio and prostate gland size; it correctly predicted the disease condition in 83% of validation samples.
Prostate cancer (PCa) is the most frequently diagnosed cancer in the male population each year. To date, the diagnostic path for PCa detection includes the measurement of serum prostate-specific antigen (PSA) and the digital rectal exam (DRE). However, PSA-based screening has insufficient specificity and sensitivity; moreover, it cannot discriminate between the aggressive and indolent types of PCa. For this reason, the development of new clinical approaches and the discovery of new biomarkers are necessary. In this work, expressed prostatic secretion (EPS)-urine samples from PCa patients and benign prostatic hyperplasia (BPH) patients were analyzed with the aim of detecting proteins differentially expressed between the two groups. To map the urinary proteome, EPS-urine samples were analyzed by data-independent acquisition (DIA), a high-sensitivity method particularly suitable for detecting low-abundance proteins. Overall, 2615 proteins were identified in 133 EPS-urine specimens, obtaining the highest proteomic coverage for this type of sample; of these 2615 proteins, 1670 were consistently identified across the entire data set. The matrix of quantified proteins in each patient was integrated with clinical parameters such as the PSA level and gland size, and the complete matrix was analyzed by machine learning algorithms (using 90% of samples for training/testing with a 10-fold cross-validation approach, and 10% of samples for validation). The best predictive model was based on the following components: semaphorin-7A (sema7A), secreted protein acidic and rich in cysteine (SPARC), FT ratio, and prostate gland size. The classifier correctly predicted the disease condition (BPH or PCa) in 83% of samples in the validation set. Data are available via ProteomeXchange with the identifier PXD035942.
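
The sample partitioning described in the abstract (90% of samples for training/testing under 10-fold cross-validation, 10% held out for validation) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name, the random seed, the simple shuffled (non-stratified) assignment, and the rounding used to size the hold-out set are all assumptions, and the resulting set sizes are not figures taken from the paper.

```python
import random


def split_holdout_and_folds(n_samples, holdout_fraction=0.10, n_folds=10, seed=0):
    """Hypothetical helper: hold out a validation set, then build k-fold splits.

    Returns (validation, folds), where folds is a list of
    (train_indices, test_indices) pairs drawn from the remaining samples.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seed is an arbitrary assumption

    # Assumption: hold-out size is the rounded fraction of all samples.
    n_holdout = round(n_samples * holdout_fraction)
    validation = indices[:n_holdout]
    development = indices[n_holdout:]

    folds = []
    for k in range(n_folds):
        # Every n_folds-th sample, starting at offset k, forms the test fold.
        test = development[k::n_folds]
        test_set = set(test)
        train = [i for i in development if i not in test_set]
        folds.append((train, test))
    return validation, folds


# 133 is the number of EPS-urine specimens reported in the abstract.
validation, folds = split_holdout_and_folds(133)
print(len(validation), len(folds), len(folds[0][0]), len(folds[0][1]))
```

The held-out indices never appear in any fold, so model selection on the cross-validation folds stays independent of the final validation estimate. A stratified split that preserves the BPH/PCa ratio in each fold would be a natural refinement, but the abstract does not say whether one was used.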
