Article

CAA-PPI: A Computational Feature Design to Predict Protein-Protein Interactions Using Different Encoding Strategies

Journal

AI
Volume 4, Issue 2, Pages 385-400

Publisher

MDPI
DOI: 10.3390/ai4020020

Keywords

machine learning; protein-protein interactions; encoding strategy; feature representation


Protein-protein interactions (PPIs) play a crucial role in many biological processes and have become a key focus of systems biology. They are central to predicting protein function and the druggability of molecules. This article introduces CAA-PPI, a new feature representation method that extracts features from protein sequences and achieves high accuracy in PPI prediction.
Protein-protein interactions (PPIs) are involved in a wide variety of biological processes, including cell-to-cell interactions and metabolic and developmental control. PPIs are becoming one of the most important targets of systems biology, and they play a fundamental part in predicting the function of a target protein and the druggability of molecules. Much work has gone into methods that predict PPIs computationally, since this supplements laboratory experiments and offers a cost-effective way of predicting the most likely set of interactions at the scale of the entire proteome. This article presents an innovative feature representation method (CAA-PPI) that extracts features from protein sequences using two different encoding strategies, followed by an ensemble learning method. The random forest method was used as the classifier for PPI prediction. CAA-PPI considers the role of the trigram and the bond between a given amino acid and its neighboring residues. The proposed PPI model achieved a prediction accuracy of more than 98% with one encoding scheme and more than 95% with the other on two diverse PPI datasets, H. pylori and Yeast. Further comparisons of CAA-PPI with existing sequence-based methods showed the proficiency of the proposed method with both encoding strategies. To assess its practical predictive ability, a blind test was carried out on datasets from five other species, independent of the training set, and the results confirmed the effectiveness of CAA-PPI with both encoding schemes.
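
To make the idea of trigram-based sequence features concrete, the following is a minimal sketch and not the CAA-PPI encoding itself, which the abstract does not specify. It assumes a hypothetical seven-class grouping of amino acids (`GROUPS`), counts overlapping class trigrams in each sequence, normalizes them to frequencies, and concatenates the vectors of the two proteins in a candidate pair. The function names `trigram_features` and `pair_features` are illustrative only. Vectors of this kind could then be passed to a random forest classifier, the classifier named in the abstract.

```python
from collections import Counter
from itertools import product

# Hypothetical amino-acid grouping, for illustration only; the encoding
# strategies actually used by CAA-PPI are not given in the abstract.
GROUPS = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
AA_TO_CLASS = {aa: str(i) for i, group in enumerate(GROUPS) for aa in group}

# All 7^3 = 343 possible class trigrams, in a fixed lexicographic order.
TRIGRAMS = ["".join(t) for t in product("0123456", repeat=3)]


def trigram_features(sequence):
    """Return the normalized frequency of each class trigram in a sequence.

    Residues outside the 20 standard amino acids are dropped before
    trigrams are formed, so their neighbors become adjacent.
    """
    classes = [AA_TO_CLASS[aa] for aa in sequence.upper() if aa in AA_TO_CLASS]
    counts = Counter(
        "".join(classes[i:i + 3]) for i in range(len(classes) - 2)
    )
    total = sum(counts.values())
    if total == 0:
        # Sequence too short to contain any trigram.
        return [0.0] * len(TRIGRAMS)
    return [counts[t] / total for t in TRIGRAMS]


def pair_features(seq_a, seq_b):
    """Concatenate the trigram vectors of two proteins into one pair vector."""
    return trigram_features(seq_a) + trigram_features(seq_b)


if __name__ == "__main__":
    vector = pair_features("MKVLAAGIVK", "MDEKRRCHNQ")
    print(len(vector))
```

Concatenation makes the pair vector depend on the order of the two proteins; a symmetric combination (for example, an element-wise sum) would be an alternative design choice if order should not matter.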


