4.7 Article

An Ensemble Framework for Improving the Prediction of Deleterious Synonymous Mutation

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TCSVT.2021.3063145

Keywords

Training; Predictive models; Splicing; Bioinformatics; Logistics; Encoding; Diseases; Deleterious synonymous mutation; ensemble predictor; machine learning; pathogenicity prediction

Funding

  1. National Natural Science Foundation of China [62072003, 61672037, 11835014, U19A2064, 61973295]
  2. National Key Research and Development Program of China [2020YFA0908700]
  3. Recruitment Program for Leading Talent Team of Anhui Province 2016-19
  4. Shanghai Municipal Science and Technology Major Project [2018SHZDZX01]
  5. Fundamental Research Funds for the Central Universities [2242021R10097]
  6. Zhangjiang Lab

In this study, a new method called EnDSM was proposed to accurately predict deleterious synonymous mutations by combining multiple features and machine learning algorithms. The results showed that EnDSM outperformed other predictors on both training and testing datasets. This research is of great importance for the prediction of synonymous mutations in the field of medical genomics.
In recent years, many studies have uncovered associations between synonymous mutations (SMs) and human diseases. Identifying deleterious SMs remains a challenge in medical genomics: although several computational methods have been proposed, precise prediction is still difficult. In this work, we propose EnDSM, an accurate predictor based on an ensemble framework. We explored multimodal features in four groups (functional score, conservation, splicing, and sequence features) and trained eight conceptually different machine learning classifiers on each group, resulting in 32 base classification models. We then selected four base models according to their prediction performance and used their predicted probabilities as the input feature vector of a logistic regression classifier to construct the ensemble learning model. The results suggest that EnDSM outperforms other state-of-the-art predictors on both the training and independent test datasets. We anticipate that EnDSM will become a valuable tool for deleterious SM prediction. The EnDSM server interface and the benchmarking datasets are freely available at http://bioinfo.ahu.edu.cn/EnDSM.
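
The ensemble construction described in the abstract (four feature groups, eight classifiers per group, four selected base models, logistic regression on their predicted probabilities) can be illustrated with a minimal scikit-learn sketch. This is not the authors' implementation. The synthetic data, the column split into groups, the particular eight classifiers, the use of cross-validated AUC as the selection criterion, and the choice of the top four models overall are all assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (ExtraTreesClassifier, GradientBoostingClassifier,
                              RandomForestClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_predict
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption): 20 columns split into four
# hypothetical feature groups of five columns each.
X, y = make_classification(n_samples=300, n_features=20,
                           n_informative=10, random_state=0)
groups = {
    "functional_score": X[:, 0:5],
    "conservation": X[:, 5:10],
    "splicing": X[:, 10:15],
    "sequence": X[:, 15:20],
}


def make_learners():
    """Eight conceptually different classifiers (an assumed selection)."""
    return {
        "lr": LogisticRegression(max_iter=1000),
        "knn": KNeighborsClassifier(),
        "nb": GaussianNB(),
        "dt": DecisionTreeClassifier(random_state=0),
        "rf": RandomForestClassifier(n_estimators=50, random_state=0),
        "et": ExtraTreesClassifier(n_estimators=50, random_state=0),
        "gb": GradientBoostingClassifier(n_estimators=50, random_state=0),
        "svm": SVC(probability=True, random_state=0),
    }


# Out-of-fold positive-class probabilities for every (group, learner)
# pair: 4 groups x 8 learners = 32 base models.
oof = {}
for group_name, X_group in groups.items():
    for learner_name, clf in make_learners().items():
        proba = cross_val_predict(clf, X_group, y, cv=5,
                                  method="predict_proba")[:, 1]
        oof[(group_name, learner_name)] = proba

# Rank base models by cross-validated AUC and keep the best four (assumed rule).
scores = {key: roc_auc_score(y, proba) for key, proba in oof.items()}
selected = sorted(scores, key=scores.get, reverse=True)[:4]

# The selected models' probabilities form the meta-classifier's input vector.
meta_X = np.column_stack([oof[key] for key in selected])
meta_clf = LogisticRegression().fit(meta_X, y)
ensemble_proba = meta_clf.predict_proba(meta_X)[:, 1]
```

Using out-of-fold probabilities keeps each base model from scoring samples it was trained on. Because the selection step and the meta-classifier in this sketch are both fitted on the same samples, any performance estimate would still need a held-out set, such as the independent test dataset the abstract mentions.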
