Article

Protein transfer learning improves identification of heat shock protein families

Journal

PLOS ONE
Volume 16, Issue 5

Publisher

Public Library of Science
DOI: 10.1371/journal.pone.0251865

Funding

  1. National Research Foundation (NRF) of Korea - Ministry of Science and ICT [2018R1A2B3001628, 2014M3C9A3063541, 2019R1G1A1003253]
  2. Ministry of Agriculture, Food and Rural Affairs [918013-4]
  3. Brain Korea 21 Plus Project in 2021
  4. National Research Foundation of Korea [2019R1G1A1003253] Funding Source: Korea Institute of Science & Technology Information (KISTI), National Science & Technology Information Service (NTIS)

This study introduces two novel deep learning algorithms, DeepHSP and DeeperHSP, for predicting heat shock proteins (HSPs). By combining a convolutional neural network (CNN) with protein transfer learning, DeeperHSP significantly outperforms state-of-the-art algorithms in both cross-validation and independent test experiments.
Heat shock proteins (HSPs) play a pivotal role as molecular chaperones under unfavorable conditions. Although HSPs are of great importance, their computational identification remains a significant challenge. Previous studies have two major limitations. First, they relied heavily on amino acid composition features, which inevitably limited their prediction performance. Second, their prediction performance was overestimated because of independent two-stage evaluations and redundancy between training and test data. To overcome these limitations, we introduce two novel deep learning algorithms: (1) the time-efficient DeepHSP and (2) the high-performance DeeperHSP. DeepHSP is a convolutional neural network (CNN)-based model that classifies non-HSPs and six HSP families simultaneously. It outperforms state-of-the-art algorithms while requiring 14-15 times less time for both training and inference. We further improve the performance of DeepHSP by taking advantage of protein transfer learning: while DeepHSP is trained on raw protein sequences, DeeperHSP is trained on top of pre-trained protein representations. As a result, DeeperHSP substantially outperforms state-of-the-art algorithms, increasing F1 scores by 20% in cross-validation and by 10% in independent test experiments. We envision that the proposed algorithms can provide proteome-wide prediction of HSPs and support various downstream analyses in pathology and clinical research.
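
To make the setup in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It shows one plausible way to turn a raw protein sequence into a one-hot matrix that a CNN could consume, and how an F1 score averaged over seven classes (non-HSP plus six HSP families) might be computed. The class labels, the function names `one_hot_encode` and `macro_f1`, the padding scheme, and the use of macro averaging are all assumptions made for illustration; the abstract does not specify them.

```python
# Illustrative sketch only; all names and choices below are assumptions.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}

# Hypothetical label set: one non-HSP class plus six HSP families.
CLASSES = ["non-HSP"] + [f"HSP-family-{i}" for i in range(1, 7)]


def one_hot_encode(sequence, max_len):
    """Return a max_len x 20 matrix of 0/1 values.

    The sequence is truncated to max_len; padding positions and
    non-standard residues are left as all-zero rows.
    """
    matrix = [[0] * len(AMINO_ACIDS) for _ in range(max_len)]
    for pos, residue in enumerate(sequence[:max_len].upper()):
        idx = AA_INDEX.get(residue)
        if idx is not None:
            matrix[pos][idx] = 1
    return matrix


def macro_f1(y_true, y_pred, classes):
    """Unweighted mean of per-class F1 scores.

    A class with no true and no predicted samples contributes 0.0.
    """
    scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

In a transfer-learning variant such as the one the abstract describes for DeeperHSP, the one-hot matrix would be replaced by representations from a pre-trained protein model; that step is omitted here because the abstract does not name the model used.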
