Journal
IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS
Volume 26, Issue 6, Pages 2822-2829
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/JBHI.2021.3137840
Keywords
Proteins; Training; Convolution; Bioinformatics; Amino acids; Support vector machines; Stress; Adenine-5'-triphosphate; ATP-binding protein; deep convolutional neural network; residue-residue contact information
Funding
- National Natural Science Foundation of China [62002181, 62061035]
- Self-topic/Open Project of Ecological Big Data Engineering Research Center of the Ministry of Education
This study developed DeepRCI, a novel method for predicting ATP-binding proteins, which reached 93.61% accuracy on the test set after hyperparameter selection. A comparison of residue-residue contact information datasets showed that high noise levels reduce prediction accuracy, a problem the authors expect to diminish as more sequence data become available.
Adenine-5'-triphosphate (ATP) is a direct energy source for various activities of tissues and cells in the body. The release of energy from ATP requires the assistance of ATP-binding proteins. Therefore, identifying ATP-binding proteins is of great significance for research on organisms. Several methods for predicting ATP-binding proteins exist, but their accuracy is too low for the predictions to be reliable. Here, we designed a novel method, called DeepRCI (based on Deep convolutional neural network and Residue-residue Contact Information), for predicting ATP-binding proteins. To maximize the performance of our method, we experimented with different hyperparameters and finally chose a deep convolutional neural network with a depth of 12 and 512 filters, and an input size of 448 × 448. Using this model, DeepRCI achieved an accuracy of 93.61% on the test set, a significant improvement of 11.78% over the state-of-the-art methods. We also compared the performance on residue-residue contact information datasets with different noise levels, which are mainly due to gaps in the multiple sequence alignment. Compared with the low-noise dataset, the prediction accuracy on the high-noise dataset is reduced by 6.78%, which affects the performance of DeepRCI to a certain extent. We believe that, as the amount of sequence data increases, this problem will eventually be solved. Finally, we provide a web service for DeepRCI; the link is given in the Data Availability section.
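The abstract reports a fixed 448 × 448 network input, while residue-residue contact maps are L × L for a protein of length L. The abstract does not say how maps of different sizes are fitted to that input, so the following is only a minimal sketch under an assumed scheme: zero-pad shorter maps and crop longer ones. The function name `fit_contact_map` and the toy values are hypothetical and not taken from DeepRCI.

```python
from typing import List

INPUT_SIZE = 448  # input side length reported in the abstract


def fit_contact_map(contacts: List[List[float]],
                    size: int = INPUT_SIZE) -> List[List[float]]:
    """Zero-pad or crop a square L x L contact map to size x size.

    Assumption: padding/cropping from the top-left corner. The source
    does not specify how DeepRCI handles proteins of different lengths.
    """
    n = len(contacts)
    if any(len(row) != n for row in contacts):
        raise ValueError("contact map must be square")

    # Start from an all-zero canvas; zeros stand for "no contact information".
    fitted = [[0.0] * size for _ in range(size)]

    # Copy the overlapping top-left block (the whole map if L <= size).
    keep = min(n, size)
    for i in range(keep):
        for j in range(keep):
            fitted[i][j] = float(contacts[i][j])
    return fitted


# Toy 3-residue map of contact scores (illustrative values only).
toy = [
    [0.0, 0.9, 0.1],
    [0.9, 0.0, 0.8],
    [0.1, 0.8, 0.0],
]
padded = fit_contact_map(toy)
cropped = fit_contact_map(toy, size=2)
```

Other schemes, such as resizing by interpolation or centring the map on the canvas, would serve the same purpose; the sketch only illustrates why a fixed input size requires some normalization step.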