4.7 Article Proceedings Paper

Fast Prediction of Protein Methylation Sites Using a Sequence-Based Feature Selection Technique

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TCBB.2017.2670558

Keywords

Protein methylation site; machine learning based method; feature representation; feature selection technique

Funding

  1. National Natural Science Foundation of China [61370010, 31271405, 31671362]

Ask authors/readers for more resources

Protein methylation, an important post-translational modification, plays crucial roles in many cellular processes. The accurate prediction of protein methylation sites is fundamentally important for revealing the molecular mechanisms undergoing methylation. In recent years, computational prediction based on machine learning algorithms has emerged as a powerful and robust approach for identifying methylation sites, and much progress has been made in predictive performance improvement. However, the predictive performance of existing methods is not satisfactory in terms of overall accuracy. Motivated by this, we propose a novel random-forest-based predictor called MePred-RF, integrating several discriminative sequence-based feature descriptors and improving feature representation capability using a powerful feature selection technique. Importantly, unlike other methods based on multiple, complex information inputs, our proposed MePred-RF is based on sequence information alone. Comparative studies on benchmark datasets via vigorous jackknife tests indicate that our proposed MePred-RF method remarkably outperforms other state-of-the-art predictors, leading by a 4.5 percent average in terms of overall accuracy. A user-friendly webserver that implements the proposed method has been established for researchers' convenience, and is now freely available for public use through http://server.malab.cn/MePred-RF. We anticipate our research tool to be useful for the large-scale prediction and analysis of protein methylation sites.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.7
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available