Article

Navigating the protein fitness landscape with Gaussian processes

Publisher

NATL ACAD SCIENCES
DOI: 10.1073/pnas.1215251110

Keywords

protein engineering; recombination; machine learning; experimental design; active learning

Funding

  1. National Institutes of Health
  2. Institute for Collaborative Biotechnologies through US Army Research Office [W911NF-09-0001]
  3. Swiss National Science Foundation [200021_137971]

Knowing how protein sequence maps to function (the fitness landscape) is critical for understanding protein evolution as well as for engineering proteins with new and useful properties. We demonstrate that the protein fitness landscape can be inferred from experimental data, using Gaussian processes, a Bayesian learning technique. Gaussian process landscapes can model various protein sequence properties, including functional status, thermostability, enzyme activity, and ligand binding affinity. Trained on experimental data, these models achieve unrivaled quantitative accuracy. Furthermore, the explicit representation of model uncertainty allows for efficient searches through the vast space of possible sequences. We develop and test two protein sequence design algorithms motivated by Bayesian decision theory. The first one identifies small sets of sequences that are informative about the landscape; the second one identifies optimized sequences by iteratively improving the Gaussian process model in regions of the landscape that are predicted to be optimized. We demonstrate the ability of Gaussian processes to guide the search through protein sequence space by designing, constructing, and testing chimeric cytochrome P450s. These algorithms allowed us to engineer active P450 enzymes that are more thermostable than any previously made by chimeragenesis, rational design, or directed evolution.
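The two design strategies described above can be illustrated with a minimal sketch. It is not the authors' implementation: the block-wise one-hot encoding, the linear kernel, the noise level, the toy landscape, and the acquisition rules (pick the highest posterior uncertainty for the "informative" step, pick the highest upper confidence bound for the "optimized" step) are all assumptions chosen for brevity, and every function name is hypothetical.

```python
import itertools
import numpy as np

def one_hot(chimeras, n_parents):
    """Encode each chimera (one parent index per block) as a flat 0/1 vector."""
    chimeras = np.asarray(chimeras)
    return np.eye(n_parents)[chimeras].reshape(len(chimeras), -1)

def gp_posterior(X_train, y_train, X_test, noise=0.1):
    """GP regression with a linear kernel on one-hot features (an assumed kernel).

    With this encoding, k(a, b) counts the blocks two chimeras share.
    Returns the posterior mean and standard deviation at X_test.
    """
    prior_mean = y_train.mean()
    K = X_train @ X_train.T + noise**2 * np.eye(len(X_train))
    K_s = X_test @ X_train.T
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train - prior_mean))
    mean = prior_mean + K_s @ alpha
    v = np.linalg.solve(L, K_s.T)
    var = np.sum(X_test**2, axis=1) - np.sum(v**2, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))

def pick_informative(std, candidates):
    """Stand-in for the 'informative' design: the most uncertain candidate."""
    return candidates[np.argmax(std[candidates])]

def pick_optimized(mean, std, candidates, beta=2.0):
    """Stand-in for the 'optimized' design: the highest upper confidence bound."""
    return candidates[np.argmax(mean[candidates] + beta * std[candidates])]

# Toy landscape: 3 parents, 4 blocks, additive block effects (all made up).
rng = np.random.default_rng(0)
n_parents, n_blocks = 3, 4
library = np.array(list(itertools.product(range(n_parents), repeat=n_blocks)))
X = one_hot(library, n_parents)
true_effects = rng.normal(size=X.shape[1])  # hidden ground truth for the toy

# Pretend 10 randomly chosen chimeras have been measured with noise.
measured = rng.choice(len(library), size=10, replace=False)
y = X[measured] @ true_effects + 0.1 * rng.normal(size=len(measured))

mean, std = gp_posterior(X[measured], y, X)
unmeasured = np.setdiff1d(np.arange(len(library)), measured)

next_informative = pick_informative(std, unmeasured)
next_optimized = pick_optimized(mean, std, unmeasured)
print("most informative chimera:", library[next_informative])
print("most promising chimera:  ", library[next_optimized])
```

In this sketch the posterior standard deviation depends only on which sequences were measured, not on the measured values, so an informative set can be planned before any experiment is run. The optimization step, by contrast, needs the measurements, and would be repeated after each new round of data.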
