Identification of Human Membrane Protein Types by Incorporating Network Embedding Methods

Journal

IEEE ACCESS
Volume 7, Pages 140794-140805

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/ACCESS.2019.2944177

Keywords

Proteins; Feature extraction; Biomembranes; Predictive models; Radio frequency; Classification algorithms; Decision trees; Membrane protein; network embedding method; random forest; integrated model

Funding

  1. Natural Science Foundation of Shanghai [17ZR1412500]
  2. Science and Technology Commission of Shanghai Municipality (STCSM) [18dz2271000]

Membrane proteins are an important class of proteins and have been confirmed to play essential roles in various cellular processes. Based on their intramolecular arrangements and positions in a cell, they can be categorized into several types. However, recognizing the type of a given membrane protein via traditional biophysical methods is time-consuming and costly. In view of this, several computational models have been proposed in recent years. Most models use various kinds of information about membrane proteins, such as their sequences, domain profiles, and physicochemical properties, to extract features that are fed into downstream classification algorithms. In this study, we built two prediction models that incorporate a new class of feature extraction methods, namely network embedding methods. To this end, several protein networks were constructed using protein-protein interaction information retrieved from STRING. One model was built on features obtained by applying Mashup to seven protein networks; the other was built on features produced by Node2Vec on one comprehensive protein network. Each model adopted random forest as the classification algorithm and employed the Synthetic Minority Over-sampling Technique (SMOTE) to counter the effect of the large differences in size among membrane protein types. Furthermore, the two models were integrated into one model to improve prediction quality. The test results showed that the integrated model performed well and was superior to either individual model. We also compared our models with some previous models, and the comparison suggested that our models were competitive.
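
The class-balancing step mentioned in the abstract can be illustrated with a small sketch. The code below is a simplified, standard-library-only version of the SMOTE idea (interpolating between a minority-class sample and one of its nearest minority-class neighbours); it is not the authors' implementation. The function name `smote`, its parameters, and the toy 2-D vectors standing in for network-embedding features are all assumptions made for illustration.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Simplified SMOTE: create n_synthetic points by interpolating between
    a randomly chosen minority sample and one of its k nearest
    minority-class neighbours (Euclidean distance)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by distance to the base sample.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        # Pick a random position on the segment from base to neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical 2-D feature vectors for a rare membrane protein type.
rare_type = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(rare_type, n_synthetic=6, k=2)
print(len(new_points))
```

In a full pipeline along the lines the abstract describes, the synthetic points would be added to the training set of the under-represented types before fitting the random forest; real embedding vectors would have many more dimensions than this toy example.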
