Journal
BRIEFINGS IN BIOINFORMATICS
Volume 22, Issue 6, Pages -
Publisher
OXFORD UNIV PRESS
DOI: 10.1093/bib/bbab336
Keywords
protein-DNA interactions; DNA-binding residues; A-DNA; B-DNA; single-stranded DNA; double-stranded DNA; prediction
Funding
- Robert J. Mattauch Endowment funds
- National Natural Science Foundation of China [61802329]
- Innovation Team Support Plan of University Science and Technology of Henan Province [19IRT-STHN014]
- Nanhu Scholars Program for Young Scholars of the Xinyang Normal University
Efforts to elucidate protein-DNA interactions at the molecular level rely on accurate predictions of DNA-binding residues in protein sequences. DNAgenie, a new predictor utilizing a custom-designed machine learning architecture, outperforms current methods in predicting residue-level interactions with A-DNA, B-DNA, and single-stranded DNA, reducing cross-predictions and generating promising leads for potential DNA-binding proteins.
Efforts to elucidate protein-DNA interactions at the molecular level rely in part on accurate predictions of DNA-binding residues in protein sequences. While there are over a dozen computational predictors of DNA-binding residues, they are DNA-type agnostic and significantly cross-predict residues that interact with other ligands as DNA binding. We leverage a custom-designed machine learning architecture to introduce DNAgenie, a first-of-its-kind predictor of residues that interact with A-DNA, B-DNA and single-stranded DNA. DNAgenie uses a comprehensive physicochemical profile extracted from an input protein sequence and implements a two-step refinement process to provide accurate predictions and to minimize cross-predictions. Comparative tests on an independent test dataset demonstrate that DNAgenie outperforms the current methods that we adapt to predict residue-level interactions with the three DNA types. Further analysis finds that the use of the second (refinement) step leads to a substantial reduction in cross-predictions. Empirical tests show that DNAgenie's outputs, when converted to coarse-grained protein-level predictions, compare favorably against recent tools that predict which DNA-binding proteins interact with double-stranded versus single-stranded DNA. Moreover, predictions from the sequences of the whole human proteome reveal that the results produced by DNAgenie substantially overlap with the known DNA-binding proteins while also including promising leads for several hundred previously unknown putative DNA binders. These results suggest that DNAgenie is a valuable tool for the sequence-based characterization of protein functions. The DNAgenie webserver is available at http://biomine.cs.vcu.edu/servers/DNAgenie/.