4.6 Article

Identification of Human Membrane Protein Types by Incorporating Network Embedding Methods

Journal

IEEE ACCESS
Volume 7, Issue -, Pages 140794-140805

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/ACCESS.2019.2944177

Keywords

Proteins; Feature extraction; Biomembranes; Predictive models; Radio frequency; Classification algorithms; Decision trees; Membrane protein; network embedding method; random forest; integrated model

Funding

  1. Natural Science Foundation of Shanghai [17ZR1412500]
  2. Science and Technology Commission of Shanghai Municipality (STCSM) [18dz2271000]

Membrane proteins are an important class of proteins and have been confirmed to play essential roles in various cellular processes. Based on their intramolecular arrangements and positions in a cell, they can be categorized into several types. However, recognizing the type of a given membrane protein via traditional biophysical methods is time-consuming and costly. In view of this, several computational models have been proposed in recent years. Most of these models used various kinds of information about membrane proteins, such as their sequences, domain profiles, and physicochemical properties, to extract features, which were then fed into downstream classification algorithms. In this study, we built two novel prediction models that incorporate a new type of feature extraction method, namely network embedding. To this end, several protein networks were constructed using protein-protein interaction information retrieved from STRING. One model was constructed from features obtained by applying Mashup to seven protein networks; the other was built using features produced by Node2Vec on one comprehensive protein network. Each model adopted random forest as the classification algorithm and employed the Synthetic Minority Over-sampling Technique (SMOTE) to counter the large differences in size among membrane protein types. Furthermore, the two models were integrated into one model to improve prediction quality. The test results showed that the integrated model performed well and was superior to either individual model. We also compared our models with some previous models, and the comparison suggested that ours were competitive.
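
Two steps named in the abstract, SMOTE oversampling and the integration of two models, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `smote` and `integrate` are hypothetical, the feature vectors are toy values rather than Mashup or Node2Vec embeddings, and the random forest classifier is omitted. The abstract does not state how the two models were integrated, so averaging their per-class probabilities is an assumption made here for illustration only.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points for a minority class.

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority-class neighbours.
    Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest neighbours of `base` within the minority class, excluding itself.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> mate
        synthetic.append([b + gap * (m - b) for b, m in zip(base, mate)])
    return synthetic


def integrate(probs_a, probs_b):
    """Average two per-class probability vectors (an assumed integration rule).

    Returns the index of the winning class and the averaged vector.
    """
    avg = [(a + b) / 2 for a, b in zip(probs_a, probs_b)]
    return max(range(len(avg)), key=avg.__getitem__), avg


# Toy minority class: four 2-dimensional feature vectors (hypothetical values).
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(minority, n_new=6, k=2)

# Hypothetical class probabilities from two models for one protein.
label, avg = integrate([0.6, 0.3, 0.1], [0.2, 0.7, 0.1])
```

In a real pipeline the synthetic samples would be added to the training set only, never to the test set, so that the reported performance reflects the original class distribution.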
