Article

TransPhos: A Deep-Learning Model for General Phosphorylation Site Prediction Based on Transformer-Encoder Architecture

Journal

Publisher

MDPI
DOI: 10.3390/ijms23084263

Keywords

phosphorylation site prediction; transformer; post-translational modifications

Funding

  1. National Natural Science Foundation of China [61873280, 61873281, 61972416]
  2. Natural Science Foundation of Shandong Province [ZR2019MF012]

Protein phosphorylation is a critical post-translational modification in eukaryotes, and predicting phosphorylation sites accurately is challenging. This article introduces a new deep learning-based predictor called TransPhos, which combines a transformer encoder and densely connected convolutional neural network blocks. Experimental results show that TransPhos outperforms other deep learning models and prediction tools, achieving good performance in predicting phosphorylation sites on training datasets.
Protein phosphorylation is one of the most critical post-translational modifications in eukaryotes and is essential for a variety of biological processes. Many attempts have been made to improve computational predictors of phosphorylation sites. However, most of them rely on extra domain knowledge or feature selection. In this article, we present a novel deep learning-based predictor, named TransPhos, which is constructed from a transformer encoder and densely connected convolutional neural network blocks, for predicting phosphorylation sites. Experiments are conducted on the PPA (version 3.0) and Phospho.ELM datasets. The experimental results show that TransPhos performs better than several deep learning models, including convolutional neural networks (CNN), long short-term memory networks (LSTM), recurrent neural networks (RNN), and fully connected neural networks (FCNN), and than some state-of-the-art deep learning-based prediction tools, including GPS2.1, NetPhos, PPRED, Musite, PhosphoSVM, SKIPHOS, and DeepPhos. Our model achieves good performance on the training datasets for serine (S), threonine (T), and tyrosine (Y) sites, with AUC values of 0.8579, 0.8335, and 0.6953, respectively, under 10-fold cross-validation. These results demonstrate that TransPhos considerably outperforms competing predictors in general protein phosphorylation site prediction.
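
The AUC values above are reported under 10-fold cross-validation. As a rough illustration of what that evaluation involves, the following Python sketch computes AUC as the fraction of positive/negative pairs ranked correctly and builds k train/test index splits. It is a minimal sketch, not the authors' evaluation code: the function names `roc_auc` and `k_fold_indices` are hypothetical, the interleaved fold assignment is an assumption, and the toy labels and scores are invented for illustration only.

```python
from typing import List, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a correctly ranked pair
    return wins / (len(pos) * len(neg))


def k_fold_indices(n_samples: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Return k (train, test) index pairs; each sample is tested exactly once."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Assumption: simple interleaved assignment, sample i goes to fold i % k.
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        splits.append((train, test))
    return splits


if __name__ == "__main__":
    # Toy data: 1 = phosphorylated site, 0 = non-phosphorylated site.
    toy_labels = [1, 1, 0, 0]
    toy_scores = [0.9, 0.2, 0.8, 0.1]
    print(roc_auc(toy_labels, toy_scores))
    print(len(k_fold_indices(20, k=10)))
```

In a real evaluation, a model would be trained on each `train` split and scored on the matching `test` split, and the per-fold AUC values would be averaged. The pairwise loop is quadratic in the number of sites, so a rank-based or library implementation would be preferable on full-size datasets.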
