4.7 Article

Protein deep profile and model predictions for identifying the causal genes of male infertility based on deep learning

Journal

INFORMATION FUSION
Volume 75, Pages 70-89

Publisher

ELSEVIER
DOI: 10.1016/j.inffus.2021.04.012

Keywords

Data integration; Disease phenotype; Male infertility; Causal gene; Knowledge representation; Convolutional neural network; Manifold learning; Deep learning

Funding

  1. National Natural Science Foundation of China [31472054]
  2. National Key Research and Development Program of China [2016YFC1000600]


This study introduces the DPPCG framework to identify causal genes for specific disease phenotypes using deep learning computational modeling. By integrating heterogeneous biomedical big data, it creatively utilizes protein deep profiles and deep CNN models to predict causal genes of male infertility and associated pathological processes.
A principal task in dissecting the genetics of complex traits is to identify causal genes for disease phenotypes. Millions of genes have been sequenced in the data-driven genomics era, but knowledge of their causal relationships with disease phenotypes remains limited because the underlying causal genes are difficult to elucidate with laboratory-based strategies. Here, we proposed an innovative deep learning computational modeling alternative (the DPPCG framework) for identifying causal (coding) genes for a specific disease phenotype. For male infertility, we introduced proteins as intermediate cell variables, leveraging integrated deep knowledge representations (Word2vec, ProtVec, Node2vec, and Space2vec) quantitatively represented as 'protein deep profiles'. We adopted a deep convolutional neural network (CNN) classifier to model the relationships between protein deep profiles and male infertility, training deep CNN models for single-label binary classification and multi-label eight-class classification. We demonstrate the capabilities of the DPPCG framework by integrating and fully harnessing heterogeneous biomedical big data, including literature, protein sequences, protein-protein interactions, gene expressions, and gene-phenotype relationships, and through the effective indirect prediction of 794 causal genes of male infertility and associated pathological processes. We present this research in an interactive 'Smart Protein' intelligent (demo) system (http://www.smartprotein.cloud/public/home).
Researchers can benefit from our intelligent system by (i) accessing a shallow gene/protein-radar service involving research status and a knowledge graph-based vertical search; (ii) querying and downloading protein deep profile matrices; (iii) accessing intelligent recommendations for causal genes of male infertility and associated pathological processes, and references for model architectures, parameter settings, and training outputs; and (iv) carrying out personalized analysis such as online K-Means clustering.
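
As a rough illustration of items (ii) and (iv), the sketch below builds toy "protein deep profiles" by concatenating four embedding vectors per protein and then clusters them with a plain K-Means loop. Everything in it is an assumption made for illustration: the vector dimensions, the random placeholder data, and the helper names (`build_profile`, `kmeans`) are hypothetical and are not taken from the paper or from the Smart Protein system, whose actual implementation is not described here.

```python
import random


def build_profile(*views):
    """Concatenate one protein's embedding vectors into a single flat profile."""
    profile = []
    for vec in views:
        profile.extend(vec)
    return profile


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [
        min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
        for p in points
    ]


def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd-style K-Means; returns (labels, centroids)."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct input points.
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(n_iter):
        labels = assign(points, centroids)
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                dim = len(members[0])
                new_centroids.append(
                    [sum(m[d] for m in members) / len(members) for d in range(dim)]
                )
            else:
                # Keep the previous centroid if a cluster lost all members.
                new_centroids.append(centroids[j])
        if new_centroids == centroids:
            break  # assignments are stable, so the centroids stopped moving
        centroids = new_centroids
    return assign(points, centroids), centroids


# Toy data: random placeholders standing in for the four representation types
# (Word2vec, ProtVec, Node2vec, Space2vec). Dimensions are arbitrary assumptions.
data_rng = random.Random(42)


def toy_vec(dim):
    return [data_rng.gauss(0.0, 1.0) for _ in range(dim)]


profiles = [
    build_profile(toy_vec(8), toy_vec(8), toy_vec(4), toy_vec(4))
    for _ in range(30)
]
labels, centroids = kmeans(profiles, k=3)
```

Concatenation is only one simple way to combine several representations into a single matrix row; the paper's own integration scheme may differ. With real profiles, a library implementation of K-Means would normally be preferable to this hand-written loop.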
