An in silico approach to identification, categorization and prediction of nucleic acid binding proteins

Journal

BRIEFINGS IN BIOINFORMATICS
Volume 22, Issue 3

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bib/bbaa171

Keywords

DNA-binding proteins; RNA-binding proteins; classification; gene ontology; PNIDB

Funding

  1. Natural Science Foundation of China [61902259, 61771331]
  2. Natural Science Foundation of Guangdong Province [2018A0303130084]
  3. Science and Technology Innovation Commission of Shenzhen [JCYJ20170818100431895]

Exploring the function of proteins in protein-nucleic acid interactions is important both for understanding the related biological events and for predicting the interactions themselves. Collecting and annotating protein sequence information in a dedicated database supports the prediction of protein function.
Interactions between proteins and nucleic acids play an important role in many processes, such as transcription, translation and DNA repair. The mechanisms of related biological events, such as viral infection, can be understood by exploring the function of proteins in these interactions, and the same knowledge can support the design of novel drug targets. The number of known protein sequences has increased rapidly in recent years, but the databases describing the structure and function of proteins have grown much more slowly. Improving such databases is therefore valuable for predicting protein-nucleic acid interactions. The information for each sequence, including its function and interaction sites, was collected and identified, and a database called PNIDB was built. The proteins in PNIDB were grouped into 27 classes, such as transcription, immune system and structural protein. The function of each protein was then predicted using a machine learning method: a predictor was trained on labeled sequences, and the function of a protein was predicted with the trained classifier. The predictor achieved an accuracy of 77.43% under 10-fold cross-validation.
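
The train-then-evaluate procedure described in the abstract can be illustrated with a minimal sketch. The abstract does not state which features or classifier were used, so everything specific here is an assumption chosen for brevity: amino acid composition as the feature vector, a nearest-centroid classifier, and synthetic toy sequences with the hypothetical labels `class_a` and `class_b` in place of PNIDB entries. Only the overall shape (train on labeled sequences, score by 10-fold cross-validation) comes from the text.

```python
import random
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Return the 20-dimensional amino acid composition of a sequence."""
    counts = Counter(seq)
    total = sum(counts[a] for a in AMINO_ACIDS) or 1
    return [counts[a] / total for a in AMINO_ACIDS]


def train_centroids(features, labels):
    """Average the feature vectors of each class (nearest-centroid training)."""
    sums, sizes = {}, Counter(labels)
    for vec, lab in zip(features, labels):
        acc = sums.setdefault(lab, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
    return {lab: [v / sizes[lab] for v in acc] for lab, acc in sums.items()}


def predict(centroids, vec):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def dist(center):
        return sum((a - b) ** 2 for a, b in zip(center, vec))
    return min(centroids, key=lambda lab: dist(centroids[lab]))


def cross_validate(seqs, labels, k=10, seed=0):
    """Estimate accuracy by k-fold cross-validation."""
    feats = [composition(s) for s in seqs]
    idx = list(range(len(seqs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint folds covering all samples
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = train_centroids([feats[i] for i in train],
                                [labels[i] for i in train])
        correct += sum(predict(model, feats[i]) == labels[i] for i in fold)
    return correct / len(seqs)


def toy_sequence(rng, alphabet, length=60):
    """Generate a synthetic sequence (illustrative stand-in for real data)."""
    return "".join(rng.choice(alphabet) for _ in range(length))


# Toy data: two artificial classes with different residue preferences.
rng = random.Random(1)
seqs = ([toy_sequence(rng, "KRKRGS") for _ in range(20)]
        + [toy_sequence(rng, "LAVLIE") for _ in range(20)])
labels = ["class_a"] * 20 + ["class_b"] * 20

accuracy = cross_validate(seqs, labels, k=10)
print(f"10-fold CV accuracy on toy data: {accuracy:.2f}")
```

Each fold is held out once while the classifier is trained on the remaining nine, so every sequence is scored by a model that never saw it. The accuracy on this toy data says nothing about the reported 77.43%, which refers to the 27-class PNIDB task.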
