Journal
BMC BIOINFORMATICS
Volume 18, Issue -, Pages -
Publisher
BIOMED CENTRAL LTD
DOI: 10.1186/s12859-017-1715-8
Keywords
SSBs (Single-stranded DNA-binding proteins); DSBs (Double-stranded DNA-binding proteins); Binding specificity; Protein sequence
Funding
- Science and Technology Research Key Project of Educational Department of Henan Province [16A520016, 17B520002, 17B520036]
- Key Project of Science and Technology Department of Henan Province [142102210056]
- National Natural Science Foundation of China [61402153]
- China Postdoctoral Science Foundation
- Ph.D. Research Startup Foundation of Henan Normal University [qd15130, qd15132, qd15129]
Background: DNA-binding proteins perform important functions in a great number of biological activities. They can interact with single-stranded DNA (ssDNA) or double-stranded DNA (dsDNA), and can accordingly be categorized as single-stranded DNA-binding proteins (SSBs) or double-stranded DNA-binding proteins (DSBs). Identifying DNA-binding proteins from amino acid sequences can help annotate protein functions and clarify binding specificity. In this study, we systematically consider several schemes for representing protein sequences: overall amino acid composition (OAAC) features, dipeptide compositions, position-specific scoring matrix (PSSM) profiles, and split amino acid composition (SAA). We then adopt support vector machine (SVM) and random forest (RF) classification models to distinguish SSBs from DSBs.
Results: Our results suggest that some sequence features can significantly differentiate DSBs from SSBs. Evaluated by 10-fold cross-validation on the benchmark datasets, our prediction method achieves an accuracy of 88.7% and an area under the curve (AUC) of 0.919. Moreover, our method performs well in independent testing.
Conclusions: Using various sequence-derived features, we propose a novel method that accurately distinguishes DSBs from SSBs. The method also explores novel features, which could help uncover the binding specificity of DNA-binding proteins.
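The abstract names amino acid composition and dipeptide composition as sequence representations but does not define them. The sketch below illustrates the conventional definitions (the fraction of each of the 20 standard residues, and the fraction of each of the 400 ordered residue pairs); it is an assumption that the paper's OAAC and dipeptide features follow these definitions exactly. The function names and the alphabet ordering are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import product

# Assumed alphabet: the 20 standard amino acids in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(seq):
    """Return a 20-element list: the fraction of each standard residue in seq."""
    # Non-standard symbols (e.g. X) are ignored; this is an illustrative choice.
    residues = [aa for aa in seq.upper() if aa in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    counts = Counter(residues)
    return [counts[aa] / len(residues) for aa in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return a 400-element list: the fraction of each ordered residue pair."""
    seq = seq.upper()
    # Adjacent pairs in which both residues are standard amino acids.
    pairs = [
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in AMINO_ACIDS and seq[i + 1] in AMINO_ACIDS
    ]
    if not pairs:
        raise ValueError("sequence contains no valid dipeptides")
    counts = Counter(pairs)
    all_dipeptides = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    return [counts[d] / len(pairs) for d in all_dipeptides]


if __name__ == "__main__":
    example = "AACD"  # toy sequence, not taken from the paper's datasets
    features = amino_acid_composition(example) + dipeptide_composition(example)
    print(len(features))  # 20 composition values followed by 400 dipeptide values
```

Concatenating the two lists gives a fixed-length vector regardless of protein length, which is what makes such features usable as input to classifiers like SVM or RF.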