Article

DNAgenie: accurate prediction of DNA-type-specific binding residues in protein sequences

Journal

BRIEFINGS IN BIOINFORMATICS
Volume 22, Issue 6

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bib/bbab336

Keywords

protein-DNA interactions; DNA-binding residues; A-DNA; B-DNA; single-stranded DNA; double-stranded DNA; prediction

Funding

  1. Robert J. Mattauch Endowment funds
  2. National Natural Science Foundation of China [61802329]
  3. Innovation Team Support Plan of University Science and Technology of Henan Province [19IRT-STHN014]
  4. Nanhu Scholars Program for Young Scholars of the Xinyang Normal University


Efforts to elucidate protein-DNA interactions at the molecular level rely on accurate predictions of DNA-binding residues in protein sequences. DNAgenie, a new predictor utilizing a custom-designed machine learning architecture, outperforms current methods in predicting residue-level interactions with A-DNA, B-DNA, and single-stranded DNA, reducing cross-predictions and generating promising leads for potential DNA-binding proteins.
Efforts to elucidate protein-DNA interactions at the molecular level rely in part on accurate predictions of DNA-binding residues in protein sequences. While there are over a dozen computational predictors of DNA-binding residues, they are DNA-type agnostic and significantly cross-predict residues that interact with other ligands as DNA-binding. We leverage a custom-designed machine learning architecture to introduce DNAgenie, a first-of-its-kind predictor of residues that interact with A-DNA, B-DNA and single-stranded DNA. DNAgenie uses a comprehensive physicochemical profile extracted from an input protein sequence and implements a two-step refinement process to provide accurate predictions and to minimize cross-predictions. Comparative tests on an independent test dataset demonstrate that DNAgenie outperforms the current methods that we adapt to predict residue-level interactions with the three DNA types. Further analysis finds that the second (refinement) step leads to a substantial reduction in cross-predictions. Empirical tests show that DNAgenie's outputs, when converted to coarse-grained protein-level predictions, compare favorably against recent tools that predict which DNA-binding proteins interact with double-stranded versus single-stranded DNA. Moreover, predictions for the sequences of the whole human proteome reveal that the results produced by DNAgenie substantially overlap with the known DNA-binding proteins while also including promising leads for several hundred previously unknown putative DNA binders. These results suggest that DNAgenie is a valuable tool for the sequence-based characterization of protein functions. The DNAgenie webserver is available at http://biomine.cs.vcu.edu/servers/DNAgenie/.
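
The abstract mentions converting residue-level outputs into coarse-grained protein-level predictions but does not say how this is done. As an illustration only, the sketch below shows one plausible way such a conversion could work. It is not the authors' procedure: the function names, the 0.5 score threshold, and the minimum of three predicted binding residues are all assumptions made for this example. The only domain fact it relies on is that A-DNA and B-DNA are both double-stranded forms.

```python
from typing import Dict, List

# The three DNA types named in the abstract.
DNA_TYPES = ("A-DNA", "B-DNA", "ssDNA")


def protein_level_calls(
    residue_scores: Dict[str, List[float]],
    threshold: float = 0.5,   # assumed cutoff, not taken from the paper
    min_residues: int = 3,    # assumed minimum count, not taken from the paper
) -> Dict[str, bool]:
    """Hypothetical aggregation: call a protein a binder of a DNA type when
    at least `min_residues` residues score at or above `threshold`."""
    calls = {}
    for dna_type in DNA_TYPES:
        scores = residue_scores.get(dna_type, [])
        n_binding = sum(1 for s in scores if s >= threshold)
        calls[dna_type] = n_binding >= min_residues
    return calls


def strand_level_calls(calls: Dict[str, bool]) -> Dict[str, bool]:
    """Collapse the per-type calls into double- vs single-stranded calls.
    A-DNA and B-DNA are both double-stranded helices."""
    return {
        "dsDNA": calls["A-DNA"] or calls["B-DNA"],
        "ssDNA": calls["ssDNA"],
    }


# Made-up per-residue scores for a four-residue toy sequence.
example = {
    "A-DNA": [0.1, 0.2, 0.9, 0.3],
    "B-DNA": [0.8, 0.7, 0.6, 0.1],
    "ssDNA": [0.2, 0.1, 0.4, 0.3],
}
calls = protein_level_calls(example)
strands = strand_level_calls(calls)
```

With these toy scores, only the B-DNA track has three residues at or above the cutoff, so the protein is called double-stranded-binding and not single-stranded-binding. A count-based rule is one simple choice; a fraction of the sequence length or a mean score would be equally reasonable alternatives.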
