Article

Predicting Interactions Between Pathogen and Human Proteins Based on the Relation Between Sequence Length and Amino Acid Composition

Journal

CURRENT BIOINFORMATICS
Volume 16, Issue 6, Pages 799-806

Publisher

BENTHAM SCIENCE PUBL LTD
DOI: 10.2174/1574893616666210430133846

Keywords

Protein-protein interaction; pathogen-host interaction; machine learning; Ebola; HCV; SARS-CoV-2; Y. pestis

Funding

  1. National Research Foundation of Korea (NRF) - Ministry of Science and ICT [NRF-2018K2A9A2A11080914]
  2. INHA University Research Grant

Our study identified a linear relationship between amino acid composition and sequence length in proteins involved in PPIs between humans and bacteria, leading to the development of a support vector machine model that showed high performance in predicting PPIs. The model also demonstrated good performance in predicting PPIs between humans and viruses such as Ebola, HCV, and SARS-CoV-2, highlighting the potential for further research and application in identifying unknown target host proteins and designing experiments.
Aim: Both bacterial and viral infections involve a large number of protein-protein interactions (PPIs) between a pathogen and its target host.
Background: So far, many computational methods have focused on predicting PPIs within the same species rather than PPIs across different species.
Methods: From an extensive analysis of PPIs between Yersinia pestis and humans, we recently discovered an interesting relation: many proteins involved in these PPIs show a linear relation between amino acid composition and sequence length. We built a support vector machine (SVM) model that predicts PPIs between humans and bacteria using two feature types derived from this relation: the amino acid composition group (AACG) and the difference in amino acid composition between host and pathogen proteins.
Results: The SVM model achieved high performance in predicting bacteria-human PPIs, with an accuracy of 96%, sensitivity of 94%, and specificity of 98% for PPIs between humans and Yersinia pestis, for which the relation between amino acid composition and sequence length is strong. The model was also tested on PPIs between humans and viruses, including Ebola, HCV, and SARS-CoV-2, and performed well.
Conclusion: The feature types identified in our study are simple yet powerful in predicting pathogen-human PPIs. Although preliminary, our method will be useful for finding unknown target host proteins or pathogen proteins and for designing in vitro or in vivo experiments.
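As a rough illustration of the kind of features the abstract describes, the sketch below computes a 20-dimensional amino acid composition vector for a protein sequence and the element-wise difference between a host and a pathogen protein. It also shows how accuracy, sensitivity, and specificity are conventionally derived from confusion-matrix counts. The function names and the alphabetical residue ordering are assumptions made for this example. The abstract does not define how the AACG grouping is built, so that feature is not reproduced here, and this is not the authors' implementation.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code (ordering is an assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the fraction of each standard amino acid in a protein sequence."""
    # Ignore any character that is not one of the 20 standard residues.
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def composition_difference(host_seq, pathogen_seq):
    """Element-wise difference in composition (host minus pathogen)."""
    host = amino_acid_composition(host_seq)
    pathogen = amino_acid_composition(pathogen_seq)
    return [h - p for h, p in zip(host, pathogen)]


def binary_metrics(tp, tn, fp, fn):
    """Standard accuracy, sensitivity, and specificity from confusion counts."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }
```

A difference vector of this kind could serve as one input to an SVM classifier for a host-pathogen protein pair; the kernel, parameters, and training data used in the study are not stated in the abstract.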
