m5Cpred-XS: A New Method for Predicting RNA m5C Sites Based on XGBoost and SHAP

Journal

FRONTIERS IN GENETICS
Volume 13

Publisher

FRONTIERS MEDIA SA
DOI: 10.3389/fgene.2022.853258

Keywords

5-cytosine-methylation; XGBoost; machine learning; shap; feature selection

Funding

  1. National Natural Science Foundation of China [21403002]

In this study, a new method called m5Cpred-XS was proposed to predict m5C sites in three different organisms. The method utilized powerful feature selection and machine learning algorithms to train the models, and its superiority was confirmed through comparison with other methods. A web server was also deployed for easy access to the model, making it a useful tool for studying m5C sites.
As one of the most important post-transcriptional modifications of RNA, 5-cytosine-methylation (m5C) is reported to be closely related to many chemical reactions and biological functions in cells. Several computational methods have recently been proposed for identifying m5C sites, but their accuracy and efficiency are still not satisfactory. In this study, we propose a new method, m5Cpred-XS, for predicting m5C sites in H. sapiens, M. musculus, and A. thaliana. First, the SHAP method was used to select the optimal feature subset from seven different kinds of sequence-based features. Second, different machine learning algorithms were used to train the models. The results of five-fold cross-validation indicate that the model based on XGBoost achieved the highest prediction accuracy. Finally, our model was compared with other state-of-the-art models, and the comparison indicates that m5Cpred-XS is superior to these methods. Moreover, we deployed the model on a web server, and m5Cpred-XS is expected to be a useful tool for studying m5C sites.
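
The SHAP-based feature selection step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a matrix of SHAP values has already been computed for a trained model (one row per sample, one column per feature) and ranks features by their mean absolute SHAP value, which is one common way to use SHAP for selection; the abstract does not state the exact criterion used by m5Cpred-XS. The function names, feature names, and toy values below are hypothetical and are not taken from the paper.

```python
from statistics import mean


def rank_features_by_mean_abs_shap(shap_values, feature_names):
    """Return (name, score) pairs sorted by mean |SHAP value|, largest first.

    shap_values: list of rows, one per sample, each with one value per feature.
    Assumption: the values were computed beforehand for a trained model.
    """
    n_features = len(feature_names)
    if any(len(row) != n_features for row in shap_values):
        raise ValueError("each row needs exactly one SHAP value per feature")
    # Average the absolute contribution of each feature over all samples.
    scores = [mean(abs(row[j]) for row in shap_values) for j in range(n_features)]
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return [(feature_names[j], scores[j]) for j in order]


def select_top_k(shap_values, feature_names, k):
    """Keep the names of the k features with the largest mean |SHAP value|."""
    ranking = rank_features_by_mean_abs_shap(shap_values, feature_names)
    return [name for name, _ in ranking[:k]]


# Toy SHAP matrix (3 samples x 3 features); purely illustrative numbers.
toy_shap = [
    [0.50, -0.01, 0.10],
    [-0.40, 0.02, 0.20],
    [0.30, 0.00, -0.30],
]
toy_names = ["kmer_AC", "kmer_GG", "pos3_is_C"]  # hypothetical feature names

ranking = rank_features_by_mean_abs_shap(toy_shap, toy_names)
selected = select_top_k(toy_shap, toy_names, k=2)
```

In a full pipeline, the selected subset would then be used to retrain the candidate classifiers and compare them under five-fold cross-validation, as the abstract describes; the cutoff `k` is a tuning choice and is an assumption here.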
