4.6 Article

MetalExplorer, a Bioinformatics Tool for the Improved Prediction of Eight Types of Metal-Binding Sites Using a Random Forest Algorithm with Two-Step Feature Selection

Journal

CURRENT BIOINFORMATICS
Volume 12, Issue 6, Pages 480-489

Publisher

BENTHAM SCIENCE PUBL LTD
DOI: 10.2174/2468422806666160618091522

Keywords

Metal-binding site prediction; random forest; feature selection; functional annotation; machine learning; sequence analysis

Funding

  1. National Natural Science Foundation of China [61202167, 61303169, 11250110508, 31350110507]
  2. Major Interdisciplinary Project Grant - Monash University
  3. Chinese Academy of Sciences (CAS)
  4. CAS
  5. ARC Discovery Outstanding Research Award (DORA)

Background: Metalloproteins are involved in many biological processes, including catalysis, recognition, transport, transcription, and signal transduction. The metal ions they bind usually play enzymatic or structural roles in mediating these diverse functions. Thus, the systematic analysis and prediction of metal-binding sites using sequence and/or structural information are crucial for understanding their sequence-structure-function relationships.

Objective: The objective of this work is to develop a new computational algorithm for improved prediction of major types of metal-binding sites.

Method: We propose MetalExplorer (http://metalexplorer.erc.monash.edu.au/), a new machine learning-based method for predicting eight different types of metal-binding sites (Ca, Co, Cu, Fe, Ni, Mg, Mn, and Zn) in proteins. Our approach combines heterogeneous sequence-, structure-, and residue contact network-based features in a random forest machine-learning framework.

Results: The predictive performance of MetalExplorer was tested by cross-validation and independent tests using non-redundant datasets of known structures. The method applies a two-step feature selection approach, based on maximum relevance minimum redundancy (mRMR) followed by forward feature selection, to identify the most informative features that contribute to the prediction performance. At a precision of 60%, MetalExplorer achieved high recall values, which ranged from 59% to 88% for the eight metal ion types in fivefold cross-validation tests. Moreover, the common and type-specific features in the optimal subsets of all metal ions were characterized in terms of their contributions to the overall performance.

Conclusion: On both the benchmark and independent datasets at the 60% precision control level, MetalExplorer compared favorably with an existing metalloprotein prediction tool, SitePredict. MetalExplorer is expected to be a powerful tool for the accurate prediction of potential metal-binding sites and it should facilitate the functional analysis and rational design of novel metalloproteins.
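The abstract reports recall at a fixed 60% precision level. As a rough illustration of what such a figure means, the sketch below scans score thresholds over ranked predictions and returns the highest recall whose precision is still at least 60%. It is not the authors' evaluation code: the function name `recall_at_precision` and the toy scores and labels are hypothetical, and the abstract does not state how the threshold was actually chosen.

```python
def recall_at_precision(scores, labels, min_precision=0.60):
    """Return (recall, threshold) for the best recall with precision >= min_precision.

    scores: predicted binding probabilities (higher = more likely binding).
    labels: 1 for a true metal-binding residue, 0 otherwise.
    Hypothetical helper for illustration; not part of MetalExplorer.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("at least one positive label is required")

    # Rank predictions from most to least confident.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    best_recall, best_threshold = 0.0, None
    tp = fp = 0
    i, n = 0, len(ranked)
    while i < n:
        threshold = ranked[i][0]
        # Predictions with tied scores are accepted together.
        while i < n and ranked[i][0] == threshold:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        precision = tp / (tp + fp)
        recall = tp / total_pos
        if precision >= min_precision and recall > best_recall:
            best_recall, best_threshold = recall, threshold
    return best_recall, best_threshold


# Toy data (made up): eight residues, four of which are true binding sites.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
toy_labels = [1, 1, 0, 1, 0, 0, 1, 0]
recall, threshold = recall_at_precision(toy_scores, toy_labels)
print(f"recall={recall:.2f} at score threshold {threshold}")
```

On the toy data, accepting every residue scored 0.6 or higher recovers three of the four binding sites while keeping precision above 60%; lowering the threshold further to reach the fourth site would push precision below the control level.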
