Deciphering the Methylation Landscape in Breast Cancer: Diagnostic and Prognostic Biosignatures through Automated Machine Learning

Journal

CANCERS
Volume 13, Issue 7

Publisher

MDPI
DOI: 10.3390/cancers13071677

Keywords

breast cancer; methylation; machine learning; signature; predictive model; bioinformatics; pathway; transcription

Funding

  1. project DNA methylation as a minimally-invasive biomarker: development and validation of classifiers with prognostic and/or predictive clinical value in breast cancer therapy under the call for proposals EDULLL 103 [MIS 5049913]
  2. European Union (European Social Fund-ESF) by the Operational Programme Human Resources Development, Education and Lifelong Learning 2014-2020

The study revealed the important role of DNA methylation in breast cancer pathogenesis and constructed three high-performing gene signatures for the discrimination, identification and early diagnosis of breast cancer. The selected genes in the signatures were associated with breast cancer-related pathways, highlighting the significance of gene methylation events in breast carcinogenesis.
Simple Summary

Breast cancer (BrCa) is characterized by aberrant DNA methylation. We leveraged high-throughput methylation data from BrCa and normal breast tissues and identified 11,176 to 27,786 differentially methylated genes (DMGs) against clinically relevant end-points. Innovative automated machine learning was employed to construct three highly performing signatures for (1) the discrimination of BrCa patients from healthy individuals, (2) the identification of BrCa metastatic disease and (3) the early diagnosis of BrCa. Furthermore, functional analysis revealed that most genes selected in the signatures showed associations to BrCa, with regulation of transcription being the main biological process, the nucleus being the main cellular component and transcription factor activity and sequence-specific DNA binding being the main molecular functions. Overall, revisiting methylome datasets led to three high-performance signatures that are readily available for improving BrCa precision management and significant knowledge mining related to disease pathophysiology.

Abstract

DNA methylation plays an important role in breast cancer (BrCa) pathogenesis and could contribute to driving its personalized management. We performed a complete bioinformatic analysis in BrCa whole methylome datasets, analyzed using the Illumina methylation 450 bead-chip array. Differential methylation analysis vs. clinical end-points resulted in 11,176 to 27,786 differentially methylated genes (DMGs). Innovative automated machine learning (AutoML) was employed to construct signatures with translational value. Three highly performing and low-feature-number signatures were built: (1) A 5-gene signature discriminating BrCa patients from healthy individuals (area under the curve (AUC): 0.994 (0.982-1.000)). (2) A 3-gene signature identifying BrCa metastatic disease (AUC: 0.986 (0.921-1.000)). (3) Six equivalent 5-gene signatures diagnosing early disease (AUC: 0.973 (0.920-1.000)). Validation in independent patient groups verified performance. Bioinformatic tools for functional analysis and protein interaction prediction were also employed. All protein encoding features included in the signatures were associated with BrCa-related pathways. Functional analysis of DMGs highlighted the regulation of transcription as the main biological process, the nucleus as the main cellular component and transcription factor activity and sequence-specific DNA binding as the main molecular functions. Overall, three high-performance diagnostic/prognostic signatures were built and are readily available for improving BrCa precision management upon prospective clinical validation. Revisiting archived methylomes through novel bioinformatic approaches revealed significant clarifying knowledge for the contribution of gene methylation events in breast carcinogenesis.
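
The signatures are reported as an AUC with an interval in parentheses. As a minimal illustration of how such a figure can be read, the sketch below computes an AUC as the probability that a randomly chosen case scores higher than a randomly chosen control, and attaches a percentile bootstrap interval. This is not the authors' pipeline: the page does not say how the intervals were derived, so the bootstrap is an assumption, and the scores are synthetic stand-ins for a hypothetical methylation-based classifier output.

```python
import random


def auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample contained a single class; draw again
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic, hypothetical classifier scores: 40 cases and 40 controls.
rng = random.Random(42)
labels = [1] * 40 + [0] * 40
scores = ([rng.gauss(0.7, 0.1) for _ in range(40)]
          + [rng.gauss(0.3, 0.1) for _ in range(40)])

point = auc(labels, scores)
lower, upper = bootstrap_ci(labels, scores)
print(f"AUC: {point:.3f} ({lower:.3f}-{upper:.3f})")
```

An interval whose upper bound reaches 1.000, as in all three reported signatures, is what this kind of estimate produces when cases and controls are almost perfectly separated in the evaluation sample.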
