4.7 Article

iAtbP-Hyb-EnC: Prediction of antitubercular peptides via heterogeneous feature representation and genetic algorithm based ensemble learning model

Journal

COMPUTERS IN BIOLOGY AND MEDICINE
Volume 137, Issue -, Pages -

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.compbiomed.2021.104778

Keywords

Antitubercular peptides; Ensemble classification; Genetic algorithm; One-hot encoding; Composite physiochemical properties; k-fold cross-validation test

Ask authors/readers for more resources

This study proposes the use of an intelligent and reliable prediction model for identifying antitubercular peptides, using an ensemble learning approach to improve results. The approach aims to compensate for the shortcomings of individual classification algorithms and show promising results in accurately predicting tuberculosis.
Tuberculosis (TB) is a worldwide illness caused by the bacteria Mycobacterium tuberculosis. Owing to the high prevalence of multidrug-resistant tuberculosis, numerous traditional strategies for developing novel alternative therapies have been presented. The effectiveness and dependability of these procedures are not always consistent. Peptide-based therapy has recently been regarded as a preferable alternative due to its excellent selectivity in targeting specific cells without affecting the normal cells. However, due to the rapid growth of the peptide samples, predicting TB accurately has become a challenging task. To effectively identify antitubercular peptides, an intelligent and reliable prediction model is indispensable. An ensemble learning approach was used in this study to improve expected results by compensating for the shortcomings of individual classification algorithms. Initially, three distinct representation approaches were used to formulate the training samples: k-space amino acid composition, composite physiochemical properties, and one-hot encoding. The feature vectors of the applied feature extraction methods are then combined to generate a heterogeneous vector. Finally, utilizing individual and heterogeneous vectors, five distinct nature classification models were used to evaluate prediction rates. In addition, a genetic algorithm-based ensemble model was used to improve the suggested model's prediction and training capabilities. Using Training and independent datasets, the proposed ensemble model achieved an accuracy of 94.47% and 92.68%, respectively. It was observed that our proposed iAtbP-Hyb-EnC model outperformed and reported similar to 10% highest training accuracy than existing predictors. The iAtbP-Hyb-EnC model is suggested to be a reliable tool for scientists and might play a valuable role in academic research and drug discovery. The source code and all datasets are publicly available at https://github.com/Farman335/iAtbP-Hyb-EnC.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.7
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available