4.4 Article

Sequence-Based Prediction of Cysteine Reactivity Using Machine Learning

Journal

BIOCHEMISTRY
Volume 57, Issue 4, Pages 451-460

Publisher

AMER CHEMICAL SOC
DOI: 10.1021/acs.biochem.7b00897

Funding

  1. Ministry of Science and Technology of the People's Republic of China (National Key R&D Program of China) [2016YFA0501500]
  2. National Natural Science Foundation of China [21472008, 81490740]
  3. 1000 Talents Plan Young Investigator Award

As one of the most intrinsically reactive amino acids, cysteine carries out a variety of important biochemical functions, including catalysis and redox regulation. Discovery and characterization of cysteines with heightened reactivity will help annotate protein functions. Chemical proteomic methods have been used to quantitatively profile cysteine reactivity in native proteomes, showing a strong correlation between the chemical reactivity of a cysteine and its functionality; however, the relationship between cysteine reactivity and local sequence has not yet been systematically explored. Herein, we report a machine learning method, sbPCR (sequence-based prediction of cysteine reactivity), which combines the basic local alignment search tool (BLAST), truncated composition of k-spaced amino acid pairs analysis, and a support vector machine to predict hyper-reactive cysteines based only on local sequence features. Using a benchmark set compiled from hyper-reactive cysteines in human proteomes, our method achieves a prediction accuracy of 98%, a precision of 95%, and a recall of 89%. We used the governing features of these local sequence motifs to extend the prediction to potential hyper-reactive cysteines in other proteomes deposited in the UniProt database. We validated our predictions in Escherichia coli by activity-based protein profiling and discovered a hyper-reactive cysteine in a functionally uncharacterized protein, YecH. Biochemical analysis suggests that the hyper-reactive cysteine might be involved in metal binding. Our computational method provides a large inventory of potential hyper-reactive cysteines in proteomes and is highly complementary to experimental approaches for guiding the systematic annotation of protein functions in the postgenome era.
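
To make the "composition of k-spaced amino acid pairs" idea concrete, the following is a minimal Python sketch of the plain (untruncated) encoding for a sequence window centered on a cysteine. It is an illustration only, not the sbPCR implementation: the function names `cksaap` and `window_around`, the flank size, the maximum gap `k_max`, and the `X` padding character are all assumptions made here, and the truncation, BLAST, and support vector machine steps named in the abstract are not shown.

```python
from itertools import product

# The 20 standard amino acids; any other character (e.g. padding) is ignored.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs


def cksaap(window, k_max=2):
    """Return the frequency of every residue pair separated by k positions,
    for k = 0..k_max, as one flat list of 400 * (k_max + 1) values.
    k_max is a hypothetical default, not a value taken from the paper."""
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(PAIRS, 0)
        total = 0
        for i in range(len(window) - k - 1):
            pair = window[i] + window[i + k + 1]
            if pair in counts:  # skip pairs containing non-standard residues
                counts[pair] += 1
                total += 1
        # Normalize counts to frequencies; an empty window yields zeros.
        features.extend(counts[p] / total if total else 0.0 for p in PAIRS)
    return features


def window_around(sequence, index, flank=10, pad="X"):
    """Extract the residue at `index` plus `flank` residues on each side,
    padding with `pad` where the window runs past either terminus."""
    left = sequence[max(0, index - flank):index]
    right = sequence[index + 1:index + 1 + flank]
    return (pad * (flank - len(left)) + left + sequence[index]
            + right + pad * (flank - len(right)))
```

A feature vector for the cysteine at position `i` of a protein sequence `seq` would then be `cksaap(window_around(seq, i))`, which could be passed to any standard classifier.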
