Article

predML-Site: Predicting Multiple Lysine PTM Sites With Optimal Feature Representation and Data Imbalance Minimization

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TCBB.2021.3114349

Keywords

Amino acids; Peptides; Encoding; Proteins; Feature extraction; Benchmark testing; Tools; Multi-label PTM site predictor; post-translational modifications; sequence-coupling; general PseAAC; k-spaced amino acid pairs; binary encoding; amino acid factor; support vector machine; ANOVA F test; incremental feature selection; data imbalance issue; different error costs; sequence analysis

This paper presents predML-Site, a computational tool for predicting multiple post-translational modifications (PTMs) at lysine residues. For multi-label PTM site prediction it reports 84.18% accuracy, an 85.34% aiming rate, and an 86.58% coverage rate.
Identifying post-translational modifications (PTMs) is crucial in computational proteomics, cell biology, pathogenesis, and drug development because of their role in many bio-molecular mechanisms. Computational methods for predicting multiple PTMs at the same lysine residue, often referred to as K-PTM, are still evolving. This paper presents a novel computational tool, predML-Site, for predicting K-PTMs such as acetylation, crotonylation, methylation, and succinylation from an uncategorized peptide sample that may carry a single modification, multiple modifications, or none. For informative feature representation, multiple sequence encoding schemes (sequence-coupling, binary encoding, k-spaced amino acid pairs, and amino acid factor) are combined with the ANOVA F test and incremental feature selection. The core predictor is a cost-sensitive SVM classifier, which mitigates the effect of class-label imbalance in the dataset. predML-Site predicts multi-label PTM sites with 84.18% accuracy using the top 91 features. It also achieves an 85.34% aiming rate and an 86.58% coverage rate, both higher than those of existing state-of-the-art predictors under the same rigorous validation test. This performance indicates that predML-Site can serve as a supportive tool for further K-PTM studies. For the convenience of experimental scientists, predML-Site has been deployed as a user-friendly web server at http://103.99.176.239/predML-Site.
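
The aiming, coverage, and accuracy figures quoted above are multi-label metrics. As a rough illustration of how such scores are commonly computed, the sketch below uses the set-based definitions that are standard in multi-label PTM prediction: aiming is the fraction of predicted labels that are correct, coverage is the fraction of true labels that are recovered, and accuracy is the overlap divided by the union, each averaged over samples. The function name `multilabel_scores`, the `"none"` placeholder label for unmodified sites, and the toy data are assumptions made for this example; the abstract does not say how predML-Site scores samples with no modification, so this may differ from the authors' evaluation code.

```python
from typing import Dict, List, Set

NO_PTM = "none"  # assumed placeholder label for a lysine with no modification


def multilabel_scores(true_sets: List[Set[str]],
                      pred_sets: List[Set[str]]) -> Dict[str, float]:
    """Average per-sample aiming, coverage, and accuracy over label sets."""
    if not true_sets or len(true_sets) != len(pred_sets):
        raise ValueError("need two non-empty lists of equal length")

    aiming = coverage = accuracy = 0.0
    for y_true, y_pred in zip(true_sets, pred_sets):
        # Assumption: an empty set means "no modification" and is scored
        # as the single label NO_PTM, so no denominator is ever zero.
        y_true = y_true or {NO_PTM}
        y_pred = y_pred or {NO_PTM}
        overlap = len(y_true & y_pred)
        aiming += overlap / len(y_pred)             # correct / predicted
        coverage += overlap / len(y_true)           # correct / true
        accuracy += overlap / len(y_true | y_pred)  # overlap / union

    n = len(true_sets)
    return {"aiming": aiming / n,
            "coverage": coverage / n,
            "accuracy": accuracy / n}


# Toy data, invented purely for illustration.
true_labels = [{"acetylation", "succinylation"}, {"methylation"},
               set(), {"crotonylation"}]
pred_labels = [{"acetylation"}, {"methylation", "acetylation"},
               set(), {"crotonylation"}]
scores = multilabel_scores(true_labels, pred_labels)
print(scores)
```

On this toy data, the first sample misses one true label (lowering coverage) and the second adds one spurious label (lowering aiming), which shows how the two rates capture different kinds of error.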
