4.7 Article

DeepnsSNPs: Accurate prediction of non-synonymous single-nucleotide polymorphisms by combining multi-scale convolutional neural network and residue environment information

Publisher

ELSEVIER
DOI: 10.1016/j.chemolab.2021.104326

Keywords

Single nucleotide variants; Disease-causing nsSNPs; Multi-scale convolutional neural network; Residue environment information; Non-synonymous single-nucleotide polymorphisms

Funding

  1. National Natural Science Foundation of China [62072243, 61772273]
  2. Natural Science Foundation of Jiangsu [BK20201304]
  3. Natural Science Foundation of Anhui Province of China [KJ2018A0572]
  4. Foundation of National Defense Key Laboratory of Science and Technology [JZX7Y202001SY000901]

This study proposes MSCNN, a deep learning model that applies multi-scale convolution with different kernel sizes for feature processing and incorporates three types of nominal structural features to improve nsSNP prediction. In experiments on three datasets, DeepnsSNPs outperformed both individual classifiers and consensus classifiers, demonstrating its effectiveness in predicting nsSNPs.
Non-synonymous single-nucleotide polymorphisms (nsSNPs) are a common type of genetic variant, and more than 6000 diseases have been found to be caused by nsSNPs. Accurate prediction of nsSNPs is therefore important for understanding their functional mechanisms and for disease treatment. Many computational methods have been developed to distinguish disease-causing nsSNPs from neutral ones; however, overall prediction performance still leaves room for improvement. In this work, we propose a novel deep learning model called the multi-scale convolutional neural network (MSCNN). It applies multi-scale convolution with different kernel sizes for feature processing, which can capture more effective characteristics than a single convolution kernel size. In addition, we use three types of nominal structural features to further improve nsSNP prediction performance. Notably, the nsSNP sequence and structural features are extracted with the residue environment method we proposed, which proved effective for protein nsSNP prediction in our previous research. Based on the proposed MSCNN model and the extracted informative feature matrix, we implemented a new nsSNP predictor named DeepnsSNPs. DeepnsSNPs was tested on three nsSNP datasets collected from the PredictSNP1 website and achieved an average Matthews correlation coefficient of 0.507, which is on average 18.28% higher than the individual classifiers and 11.37% higher than the consensus classifier. Detailed dataset analyses demonstrate that DeepnsSNPs is useful for nsSNP prediction. We provide the Python source code and benchmark datasets at https://github.com/sera616/DeepnsSNPs.git for academic use.
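
To illustrate the idea of multi-scale convolution mentioned in the abstract, the following is a minimal NumPy sketch, not the authors' implementation. It runs parallel 1-D convolution branches with different kernel sizes over a residue-window feature matrix, applies ReLU and global max pooling in each branch, and concatenates the results. The function names (`conv1d_same`, `multi_scale_features`), the kernel sizes (3, 5, 7), the number of filters, the window shape (11 residues by 20 features), and the random weights are all assumptions chosen for illustration; the actual architecture and features are defined in the paper and its repository.

```python
import numpy as np


def conv1d_same(x, kernels):
    """Cross-correlate x with a bank of 1-D kernels using zero 'same' padding.

    x:       (length, channels) feature matrix for one residue window
    kernels: (k, channels, filters) weights
    returns: (length, filters)
    """
    k = kernels.shape[0]
    pad_left = (k - 1) // 2
    pad_right = k - 1 - pad_left
    padded = np.pad(x, ((pad_left, pad_right), (0, 0)))
    # windows has shape (length, channels, k): one slice of k positions per output row
    windows = np.lib.stride_tricks.sliding_window_view(padded, k, axis=0)
    # Sum over channels (c) and kernel positions (k) for every filter (f)
    return np.einsum("lck,kcf->lf", windows, kernels)


def multi_scale_features(x, kernel_sizes=(3, 5, 7), filters=4, seed=0):
    """Hypothetical multi-scale block: one branch per kernel size, then concatenate."""
    rng = np.random.default_rng(seed)
    branches = []
    for k in kernel_sizes:
        # Random weights stand in for trained parameters (assumption)
        w = rng.standard_normal((k, x.shape[1], filters)) * 0.1
        activated = np.maximum(conv1d_same(x, w), 0.0)  # ReLU
        branches.append(activated.max(axis=0))          # global max pool over positions
    return np.concatenate(branches)                     # (len(kernel_sizes) * filters,)


if __name__ == "__main__":
    # Assumed input: a window of 11 residues, each described by 20 features
    window = np.random.default_rng(1).standard_normal((11, 20))
    print(multi_scale_features(window).shape)
```

Each branch sees the same input at a different receptive-field width, so the concatenated vector mixes short-range and longer-range context around the variant site. In a trained model this vector would feed a classifier; here the weights are random and the output is only meaningful for its shape.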
