FRTpred: A novel approach for accurate prediction of protein folding rate and type

Journal

COMPUTERS IN BIOLOGY AND MEDICINE
Volume 149

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.compbiomed.2022.105911

Keywords

Protein folding rate; Folding type; Bioinformatics; Machine learning; Probabilistic features; Sequence analysis

Funding

  1. National Research Foundation of Korea (NRF) - Korean government (MSIT) [2017R1E1A1A01077717, 2021R1A2C1014338]
  2. National Research Foundation of Korea [2017R1E1A1A01077717] Funding Source: Korea Institute of Science & Technology Information (KISTI), National Science & Technology Information Service (NTIS)

Protein folding rate is crucial for understanding the protein folding process and designing proteins. This study presents FRTpred, a novel approach that accurately predicts the logarithmic protein folding rate constant and folding type from the provided sequence. FRTpred outperforms existing methods and can accelerate the characterization of protein data.
Protein folding rate is an important property for understanding the protein folding process and for designing proteins. Predicting it from either sequence or structural information is a challenging task in bioinformatics. Although several computational methods have been developed, the only publicly available sequence-based method shows limited accuracy when evaluated on a standardized independent dataset. This study proposes a novel approach, called FRTpred, that simultaneously predicts the logarithmic protein folding rate constant, ln(kf), and the folding type from a given sequence. First, 30 baseline models (regression models for ln(kf) and classification models for folding type) were constructed by combining 10 representative feature extraction methods with three commonly used machine-learning algorithms. The predicted values of the 30 baseline models were then combined and fed into a random forest algorithm to construct the final prediction model. Cross-validation analysis showed that FRTpred achieved mean absolute deviations of 1.491, 2.016, and 1.954 for the non-two-state, two-state, and combined models, respectively, when predicting ln(kf). Moreover, FRTpred predicted the folding type with an accuracy of 0.843. Comparisons with existing methods on independent tests showed that FRTpred is more precise for both ln(kf) and folding type prediction. Thus, FRTpred is a powerful tool that may accelerate the characterization of foldomics protein data and inspire the development of next-generation predictors. The proposed model is available as a freely accessible web server at http://thegleelab.org/FRTpred.
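
The two-layer design described above can be illustrated with a small bookkeeping sketch. It is not the authors' implementation: the abstract does not name the 10 feature extraction methods or the three algorithms, so the names below are placeholders, `baseline_predict` is a hypothetical callable, and the mean absolute deviation is assumed to be the average absolute difference between observed and predicted ln(kf). The sketch only shows how 10 encodings and 3 algorithms yield 30 baseline outputs that form one input row for a second-layer model (a random forest in the paper), and how the two reported metrics could be computed.

```python
from itertools import product

# Placeholder names: the abstract does not list the actual encodings or algorithms.
ENCODINGS = [f"encoding_{i}" for i in range(1, 11)]          # 10 feature extraction methods
ALGORITHMS = ["algorithm_A", "algorithm_B", "algorithm_C"]   # 3 machine-learning algorithms


def baseline_grid():
    """Return the (encoding, algorithm) pair behind each baseline model."""
    return list(product(ENCODINGS, ALGORITHMS))


def meta_features(sequence, baseline_predict):
    """Collect one prediction per baseline model into a single row.

    baseline_predict is a hypothetical callable taking
    (sequence, encoding, algorithm) and returning a float. The resulting
    row is what a second-layer model would receive as input.
    """
    return [baseline_predict(sequence, enc, alg) for enc, alg in baseline_grid()]


def mean_absolute_deviation(y_true, y_pred):
    """Average absolute difference between observed and predicted values (assumed definition)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def accuracy(labels_true, labels_pred):
    """Fraction of folding-type labels predicted correctly."""
    return sum(t == p for t, p in zip(labels_true, labels_pred)) / len(labels_true)
```

With trained baseline models in place of the stub, each sequence would produce one 30-element row from `meta_features`, and the second-layer model would be fitted on those rows rather than on the raw sequence encodings.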
