☆ 4.7 Article

Prediction of Probable Major Depressive Disorder in the Taiwan Biobank: An Integrated Machine Learning and Genome-Wide Analysis Approach

JOURNAL OF PERSONALIZED MEDICINE (2021)

Journal

JOURNAL OF PERSONALIZED MEDICINE

Volume 11, Issue 7, Pages -

Publisher

MDPI

DOI: 10.3390/jpm11070597

Keywords

genome-wide association study; machine learning; major depressive disorder; personalized medicine; single nucleotide polymorphisms

Funding

Taiwan Ministry of Science and Technology [MOST 109-2634-F-075-001]
Taipei Veterans General Hospital [V108D44-001-MY3-1]

Ask authors/readers for more resources

Protocol

Community support

Reagent

Community support

Automated Summary New
Abstract

The study integrated machine learning and genome-wide analysis to predict probable major depressive disorder (MDD), identifying genes and SNPs associated with MDD and establishing prediction models, with random forests performing the best among the predictive algorithms.

In light of recent advancements in machine learning, personalized medicine using predictive algorithms serves as an essential paradigmatic methodology. Our goal was to explore an integrated machine learning and genome-wide analysis approach which targets the prediction of probable major depressive disorder (MDD) using 9828 individuals in the Taiwan Biobank. In our analysis, we reported a genome-wide significant association with probable MDD that has not been previously identified: FBN1 on chromosome 15. Furthermore, we pinpointed 17 single nucleotide polymorphisms (SNPs) which show evidence of both associations with probable MDD and potential roles as expression quantitative trait loci (eQTLs). To predict the status of probable MDD, we established prediction models with random undersampling and synthetic minority oversampling using 17 eQTL SNPs and eight clinical variables. We utilized five state-of-the-art models: logistic ridge regression, support vector machine, C4.5 decision tree, LogitBoost, and random forests. Our data revealed that random forests had the highest performance (area under curve = 0.8905 +/- 0.0088; repeated 10-fold cross-validation) among the predictive algorithms to infer complex correlations between biomarkers and probable MDD. Our study suggests that an integrated machine learning and genome-wide analysis approach may offer an advantageous method to establish bioinformatics tools for discriminating MDD patients from healthy controls.

Prediction of Probable Major Depressive Disorder in the Taiwan Biobank: An Integrated Machine Learning and Genome-Wide Analysis Approach

Journal

JOURNAL OF PERSONALIZED MEDICINE

Publisher

MDPI

Keywords

Categories

Funding

Ask authors/readers for more resources

Protocol

Reagent

Authors

I am an author on this paper

Reviews

Primary Rating

Secondary Ratings

Novelty

Significance

Scientific rigor

Rate this paper

Recommended

Prediction of Probable Major Depressive Disorder in the Taiwan Biobank: An Integrated Machine Learning and Genome-Wide Analysis Approach

Journal

JOURNAL OF PERSONALIZED MEDICINE

Publisher

MDPI

Keywords

Categories

Funding

Ask authors/readers for more resources

Protocol

Reagent

Authors

I am an author on this paper

Reviews

Primary Rating

Secondary Ratings

Novelty

Significance

Scientific rigor

Rate this paper

Recommended

Export Citation

Share Paper