4.7 Article

Prediction of Antiviral peptides using transform evolutionary & SHAP analysis based descriptors by incorporation with ensemble learning strategy

Publisher

ELSEVIER
DOI: 10.1016/j.chemolab.2022.104682

Keywords

Antiviral peptides; PSSM-DWT; PSSM-Segmentation; SHAP analysis; Ensemble classification

The paper introduces an intelligent and computationally efficient learning approach for the reliable identification of AVPs. Novel evolutionary descriptors are explored, and optimal features are selected using SHAP-based global interpretation analysis. Five different classifiers are combined into a genetic-algorithm-optimized ensemble learner, which reaches classification rates of 97.33% on training samples and 95.57% on independent samples.
Viral diseases have been a major health concern in recent years. Antiviral peptides (AVPs) are a class of antimicrobial peptides (AMPs) with high potential to defend the human body against various viral diseases. Despite the large-scale production of antiviral vaccines and drugs, viral infections remain a prominent human disease. The discovery of AVPs as antiviral agents offers an effective way to treat virus-affected cells. Recently, the development of peptide-based therapeutic agents via machine learning methods has become a major area of interest owing to its promising results. In this paper, we developed an intelligent and computationally efficient learning approach for the reliable identification of AVPs. Novel evolutionary descriptors are explored by embedding discrete wavelet transform (DWT) and k-segmentation approaches into the position-specific scoring matrix (PSSM). Moreover, SHapley Additive exPlanations (SHAP) based global interpretation analysis is employed to choose optimal features by measuring the contribution of each feature in the extracted vectors. In the next phase, the selected feature spaces are examined using five different classifiers: XGBoost (XGB), k-nearest neighbor (KNN), Extra Trees classifier (ETC), Support Vector Machine (SVM), and AdaBoost (ADA). Furthermore, to boost the discriminative power of the proposed model, the predicted labels of all classifiers are passed to an optimized genetic algorithm to build an ensemble learner. Our proposed model achieved classification rates of 97.33% and 95.57% on training samples and independent samples, respectively, which is approximately 5% higher accuracy than available predictors. We anticipate that our model will be a helpful tool for researchers and may play a significant role in academic research and drug development.
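
To make the two evolutionary descriptors named in the abstract more concrete, the following is a minimal sketch of how PSSM-DWT and PSSM-Segmentation features could be computed from a PSSM given as a list of rows (one row per residue, 20 scores per row). It is not the authors' implementation: the abstract does not state the wavelet family, the number of decomposition levels, the summary statistics, or the value of k, so the Haar wavelet, `levels=2`, the mean/max/min summary, and `k=3` are all assumptions chosen for illustration, and the function names are hypothetical.

```python
from typing import List, Tuple

Matrix = List[List[float]]  # rows = residues, columns = amino-acid scores


def haar_dwt(signal: List[float]) -> Tuple[List[float], List[float]]:
    """One level of the Haar DWT: returns (approximation, detail)."""
    if len(signal) % 2:  # pad odd-length signals by repeating the last value
        signal = signal + [signal[-1]]
    s = 2 ** 0.5
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def summarize(coeffs: List[float]) -> List[float]:
    """Assumed summary statistics for one coefficient band: mean, max, min."""
    return [sum(coeffs) / len(coeffs), max(coeffs), min(coeffs)]


def pssm_dwt_features(pssm: Matrix, levels: int = 2) -> List[float]:
    """Treat each PSSM column as a signal and summarize its wavelet bands."""
    features: List[float] = []
    for column in zip(*pssm):
        signal = list(column)
        for _ in range(levels):
            signal, detail = haar_dwt(signal)
            features.extend(summarize(detail))   # detail band at this level
        features.extend(summarize(signal))       # final approximation band
    return features


def pssm_segment_features(pssm: Matrix, k: int = 3) -> List[float]:
    """Split the sequence into k contiguous segments; average each column."""
    n = len(pssm)
    if n < k:
        raise ValueError("sequence must have at least k residues")
    features: List[float] = []
    for j in range(k):
        segment = pssm[j * n // k:(j + 1) * n // k]
        for column in zip(*segment):
            features.append(sum(column) / len(column))
    return features


# Toy 6-residue PSSM with 20 columns (synthetic values, not real scores).
toy_pssm = [[float(r + c) for c in range(20)] for r in range(6)]
dwt_vector = pssm_dwt_features(toy_pssm, levels=2)
segment_vector = pssm_segment_features(toy_pssm, k=3)
```

Under these assumptions both descriptors have a fixed length regardless of peptide length: 20 columns × 3 bands × 3 statistics = 180 values for the DWT vector, and 3 segments × 20 columns = 60 values for the segmentation vector. Fixed-length vectors of this kind are what a SHAP-based feature ranking and the downstream classifiers would consume.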
