Article

SDGCCA: Supervised Deep Generalized Canonical Correlation Analysis for Multi-Omics Integration

JOURNAL OF COMPUTATIONAL BIOLOGY (2022)

Journal

JOURNAL OF COMPUTATIONAL BIOLOGY

Volume 29, Issue 8, Pages 892-907

Publisher

MARY ANN LIEBERT, INC

DOI: 10.1089/cmb.2021.0598

Keywords

Alzheimer's disease; canonical correlation analysis; deep neural networks; multi-omics; supervised learning

Categories

Biochemical Research Methods; Biotechnology & Applied Microbiology; Computer Science, Interdisciplinary Applications; Mathematical & Computational Biology; Statistics & Probability

Funding

Bio and Medical Technology Development Program of NRF - Korean government (MSIT) [NRF-2018M3C7A1054935]
Institute of Information and Communications Technology Planning and Evaluation (IITP) - Korean government (MSIT) [2019-0-01842]

Summary

Integration of multi-omics data using the proposed supervised deep generalized canonical correlation analysis (SDGCCA) method improves phenotypic classification and biomarker identification. By considering complex/nonlinear cross-data correlations between multiple modalities, SDGCCA outperforms other methods in predicting Alzheimer's disease (AD) and discriminating early- and late-stage cancers. Additionally, SDGCCA enables feature selection and identifies important multi-omics biomarkers associated with AD.

Abstract

Integration of multi-omics data provides opportunities for revealing biological mechanisms related to certain phenotypes. We propose a novel method of multi-omics integration called supervised deep generalized canonical correlation analysis (SDGCCA), which models correlation structures between nonlinear multi-omics manifolds with the aim of improving the classification of phenotypes and revealing the biomarkers related to them. SDGCCA addresses the limitations of other canonical correlation analysis (CCA)-based models (such as deep CCA and deep generalized CCA) by considering complex/nonlinear cross-data correlations between multiple (≥ 2) modalities. Although a few methods learn nonlinear CCA projections for classifying phenotypes, they consider only two views. Methods extended to multiple views either do not perform classification or do not provide feature ranking. In contrast, SDGCCA is a nonlinear multi-view CCA projection method that performs classification and ranks features. When we applied SDGCCA to predicting Alzheimer's disease (AD) in patients and to discriminating early- from late-stage cancers, it outperformed other CCA-based and other supervised methods. In addition, we demonstrate that SDGCCA can be applied for feature selection to identify important multi-omics biomarkers. When applied to AD data, SDGCCA identified clusters of genes in multi-omics data that are well known to be associated with AD.
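To make the idea of a multi-view CCA projection concrete, the following is a minimal sketch of linear generalized CCA in its MAXVAR formulation, written with NumPy. It is not the authors' SDGCCA: there are no neural networks, no labels, and no feature ranking here. It only illustrates how a single shared representation can be found for more than two views. The function name `linear_gcca`, the ridge term `reg`, and the synthetic three-view data are assumptions introduced for illustration.

```python
import numpy as np


def linear_gcca(views, k, reg=1e-3):
    """Illustrative linear MAXVAR generalized CCA (not SDGCCA itself).

    views: list of (n_samples, n_features_j) arrays sharing the same rows.
    k:     number of shared components to keep.
    reg:   small ridge term (an assumption) that keeps each covariance invertible.
    Returns the shared representation G, the centered views, and one
    projection matrix per view.
    """
    n = views[0].shape[0]
    centered = [X - X.mean(axis=0) for X in views]

    # Sum of the (regularized) projection matrices onto each view's column space.
    M = np.zeros((n, n))
    for X in centered:
        cov = X.T @ X + reg * np.eye(X.shape[1])
        M += X @ np.linalg.solve(cov, X.T)
    M = (M + M.T) / 2  # remove tiny numerical asymmetry before eigh

    # eigh returns eigenvalues in ascending order, so reverse to take the top k.
    _, eigvecs = np.linalg.eigh(M)
    G = eigvecs[:, ::-1][:, :k]

    # Per-view least-squares maps that send each centered view toward G.
    projections = [
        np.linalg.solve(X.T @ X + reg * np.eye(X.shape[1]), X.T @ G)
        for X in centered
    ]
    return G, centered, projections


# Synthetic example (assumed data): three "omics" views driven by one 2-D latent signal.
rng = np.random.default_rng(0)
n_samples = 200
latent = rng.normal(size=(n_samples, 2))
views = [
    latent @ rng.normal(size=(2, d)) + 0.1 * rng.normal(size=(n_samples, d))
    for d in (5, 6, 4)
]

G, centered, projections = linear_gcca(views, k=2)
print(G.shape)                  # shared representation, one row per sample
print([U.shape for U in projections])
```

In this sketch each view is mapped by its own matrix toward the same shared representation `G`. Per the abstract, SDGCCA differs in that the per-view mappings are nonlinear and the model is trained with phenotype supervision.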
