Article

Accurate prediction of multi-label protein subcellular localization through multi-view feature learning with RBRL classifier

Journal

BRIEFINGS IN BIOINFORMATICS
Volume 22, Issue 5, Pages -

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bib/bbab012

Keywords

multi-view information fusion; wMLDAb dimension reduction; RBRL classifier; multi-label protein subcellular localization

Funding

  1. National Natural Science Foundation of China [61863010]
  2. Key Research and Development Program of Shandong Province of China [2019GGX101001]
  3. Natural Science Foundation of Shandong Province of China [ZR2018MC007]
  4. Key Laboratory Open Foundation of Hainan Province [JSKX202001]

This article introduces Mps-mvRBRL, a prediction method for multi-label protein subcellular localization that achieves high prediction accuracy on Gram-positive bacteria, Gram-negative bacteria, plant and virus datasets through multi-view feature fusion and weighted linear discriminant analysis.
Multi-label proteins participate in carrier transportation, enzyme catalysis, hormone regulation and other life activities, and they play a key role in biopharmaceuticals and in gene and cell therapy. This article proposes a method called Mps-mvRBRL to predict the subcellular localization (SCL) of multi-label proteins. First, five feature extraction algorithms (pseudo position-specific scoring matrix, dipeptide composition, position-specific scoring matrix-transition probability composition, gene ontology and pseudo amino acid composition) are used to obtain numerical information from different views. Based on the contribution of each of the five feature extraction methods, differential evolution is used for the first time to learn a weight for each individual feature set, and the original features are then fused across views by weighted combination. Second, the fused high-dimensional features are passed through a weighted linear discriminant analysis framework based on the binary weight form to eliminate irrelevant information. Finally, the best feature vector is input into the joint ranking support vector machine and binary relevance with robust low-rank learning (RBRL) classifier to predict the SCL. Under leave-one-out cross-validation, the overall actual accuracy (OAA) and overall location accuracy (OLA) of Mps-mvRBRL on the Gram-positive bacteria training set are both 99.81%. On the plant, virus and Gram-negative bacteria test sets, the OAA values are 97.24%, 98.55% and 98.20%, respectively, and the OLA values are 97.16%, 97.62% and 98.28%, respectively. These results show that the model achieves good performance in predicting the SCL of multi-label proteins.
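
The abstract reports OAA and OLA without defining them. The following is a minimal sketch, assuming the definitions commonly used in the multi-label SCL literature: OAA is the fraction of proteins whose predicted location set matches the true set exactly, and OLA is the fraction of all true protein-location pairs that are correctly predicted. The function name `overall_scores` and the toy labels are hypothetical and are not taken from the paper.

```python
from typing import List, Set, Tuple


def overall_scores(
    true_labels: List[Set[str]], predicted_labels: List[Set[str]]
) -> Tuple[float, float]:
    """Return (OAA, OLA) under the assumed definitions described above."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("true and predicted label lists must have equal length")
    if not true_labels:
        raise ValueError("at least one protein is required")

    # OAA: a protein counts only if its predicted set equals the true set exactly.
    exact_matches = sum(
        1 for t, p in zip(true_labels, predicted_labels) if t == p
    )

    # OLA: count correctly predicted locations over all true locations.
    correct_locations = sum(
        len(t & p) for t, p in zip(true_labels, predicted_labels)
    )
    total_locations = sum(len(t) for t in true_labels)
    if total_locations == 0:
        raise ValueError("true labels contain no locations")

    oaa = exact_matches / len(true_labels)
    ola = correct_locations / total_locations
    return oaa, ola


# Toy example with made-up labels for three proteins.
true_sets = [{"cytoplasm"}, {"cell membrane", "cell wall"}, {"extracellular"}]
pred_sets = [{"cytoplasm"}, {"cell membrane"}, {"extracellular", "cell wall"}]
oaa, ola = overall_scores(true_sets, pred_sets)
print(f"OAA = {oaa:.4f}, OLA = {ola:.4f}")
```

Under these assumed definitions, OAA is the stricter measure: the second protein above is partly correct and the third is over-predicted, so neither counts toward OAA, while each still contributes one correct location to OLA.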
