Journal
BRIEFINGS IN BIOINFORMATICS
Volume 23, Issue 5
Publisher
OXFORD UNIV PRESS
DOI: 10.1093/bib/bbac174
Keywords
toxins; toxicity; machine learning; prediction; BLAST; motifs; proteins
Funding
- Department of Science and Technology (DST-INSPIRE)
- DBT-RA program in Biotechnology and Life Sciences
The study presents ToxinPred2, a web-based tool for predicting the toxicity of proteins. Utilizing similarity, motif search, and machine-learning techniques, the tool achieves high accuracy in predicting protein toxicity. It is applicable to proteins from any source, making it a valuable resource in various research fields.
Proteins and peptides have been shown to be promising therapeutic agents for a variety of diseases. However, toxicity is one of the obstacles in protein/peptide-based therapy. The current study describes a web-based tool, ToxinPred2, developed for predicting the toxicity of proteins. It is an update of ToxinPred, which was developed mainly for predicting the toxicity of peptides and small proteins. The method has been trained, tested and evaluated on three datasets curated from a recent release of SwissProt. To provide an unbiased evaluation, we performed internal validation on 80% of the data and external validation on the remaining 20%. We implemented the following techniques for predicting protein toxicity: (i) Basic Local Alignment Search Tool (BLAST)-based similarity, (ii) Motif-EmeRging and with Classes-Identification (MERCI)-based motif search and (iii) prediction models. The similarity and motif-based techniques achieved a high probability of correct prediction but poor sensitivity/coverage, whereas the models based on machine-learning techniques achieved balanced sensitivity and specificity with reasonably high accuracy. Finally, we developed a hybrid method that combines all three approaches and achieved a maximum area under the receiver operating characteristic curve of around 0.99 with a Matthews correlation coefficient of 0.91 on the validation dataset. In addition, we developed models on alternate and realistic datasets. The best machine-learning models have been implemented in the web server named 'ToxinPred2', which is available at https://webs.iiitd.edu.in/raghava/toxinpred2/, with a standalone version at https://github.com/raghavagps/toxinpred2. This is a general method developed for predicting the toxicity of proteins regardless of their source of origin.
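To make the hybrid idea concrete, here is a minimal sketch of one way a machine-learning score could be combined with similarity and motif evidence, along with the Matthews correlation coefficient used to report performance. The abstract does not specify the combination rule, so the additive scheme, the function names `hybrid_score` and `mcc`, and the `weight` value of 0.5 are all illustrative assumptions and not the published ToxinPred2 implementation.

```python
import math
from typing import Optional


def hybrid_score(ml_prob: float,
                 blast_label: Optional[int],
                 motif_found: bool,
                 weight: float = 0.5) -> float:
    """Combine three evidence sources into one score (assumed additive scheme).

    ml_prob     -- toxicity probability from a machine-learning model
    blast_label -- 1 if the top similarity hit is a known toxin,
                   0 if it is a known non-toxin, None if there is no hit
    motif_found -- True if a toxin-specific motif occurs in the sequence
    weight      -- hypothetical bonus/penalty applied per evidence source
    """
    score = ml_prob
    if blast_label == 1:
        score += weight      # similarity evidence supports "toxin"
    elif blast_label == 0:
        score -= weight      # similarity evidence supports "non-toxin"
    if motif_found:
        score += weight      # motif evidence supports "toxin"
    return score


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0           # common convention when a marginal is empty
    return (tp * tn - fp * fn) / denom
```

In this sketch, a sequence with no similarity hit and no motif keeps its machine-learning score unchanged, which reflects the trade-off described above: the similarity and motif searches are precise but cover few sequences, so the model supplies a prediction for every remaining case.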