Article

Semisupervised Gaussian Process for Automated Enzyme Search

Journal

ACS SYNTHETIC BIOLOGY
Volume 5, Issue 6, Pages 518-528

Publisher

AMER CHEMICAL SOC
DOI: 10.1021/acssynbio.5b00294

Keywords

semisupervised Gaussian process; Gaussian process regression; metabolic engineering; enzyme screening; enzyme kinetics; reaction fingerprint

Funding

  1. BBSRC/EPSRC [BB/M017702/1]
  2. Biotechnology and Biological Sciences Research Council [BB/M017702/1] Funding Source: researchfish
  3. BBSRC [BB/M017702/1] Funding Source: UKRI

Synthetic biology is today harnessing the design of novel and greener biosynthesis routes for the production of added-value chemicals and natural products. The design of novel pathways often requires a detailed selection of enzyme sequences to import into the chassis at each of the reaction steps. To address such design requirements in an automated way, we present here a tool for exploring the space of enzymatic reactions. Given a reaction and an enzyme, the tool provides a probability estimate that the enzyme catalyzes the reaction. Our tool first considers the similarity of a reaction to known biochemical reactions with respect to signatures around their reaction centers. Signatures are defined based on chemical transformation rules by using extended connectivity fingerprint descriptors. A semisupervised Gaussian process model associated with the similar known reactions then provides the probability estimate. The Gaussian process model uses information about both the reaction and the enzyme in providing the estimate. These estimates were validated experimentally by applying the Gaussian process model to a newly identified metabolite in Escherichia coli in order to search for the enzymes catalyzing its associated reactions. Furthermore, we show with several pathway design examples how the ability to assign probability estimates to enzymatic reactions can assist in bioengineering applications, providing experimental validation of our proposed approach. To the best of our knowledge, the proposed approach is the first application of Gaussian processes dealing with biological sequences and chemicals; the use of a semisupervised Gaussian process framework is also novel in the context of machine learning applied to bioinformatics. However, the ability of an enzyme to catalyze a reaction depends on the affinity between the substrates of the reaction and the enzyme. This affinity is generally quantified by the Michaelis constant K_M. Therefore, we also demonstrate the use of Gaussian process regression to predict K_M given a substrate-enzyme pair.
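
As a rough illustration of the regression step described in the abstract, the sketch below fits a Gaussian process to toy binary fingerprints of substrate-enzyme pairs and predicts log10 K_M with an uncertainty estimate. The abstract does not state which kernel or features the authors used, so the Tanimoto kernel, the 8-bit fingerprints, the noise level, and the helper names `tanimoto_kernel` and `predict_log_km` are all assumptions made for this example, and the K_M values are invented placeholders rather than measurements.

```python
import numpy as np


def tanimoto_kernel(A, B):
    """Tanimoto similarity between the rows of two binary matrices."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    inter = A @ B.T                                   # shared "on" bits
    union = A.sum(axis=1)[:, None] + B.sum(axis=1)[None, :] - inter
    # Two all-zero fingerprints are treated as identical (similarity 1).
    return np.where(union > 0, inter / np.where(union > 0, union, 1.0), 1.0)


def predict_log_km(X_train, y_train, X_test, noise=1e-2):
    """GP posterior mean and variance of log10 K_M (hypothetical helper)."""
    y_train = np.asarray(y_train, dtype=float)
    y_mean = y_train.mean()                           # constant prior mean
    K = tanimoto_kernel(X_train, X_train)
    K_star = tanimoto_kernel(X_test, X_train)
    L = np.linalg.cholesky(K + noise * np.eye(len(y_train)))
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train - y_mean))
    mean = y_mean + K_star @ alpha
    v = np.linalg.solve(L, K_star.T)
    var = 1.0 - np.sum(v ** 2, axis=0)                # k(x, x) = 1 for Tanimoto
    return mean, var


# Toy fingerprints for four substrate-enzyme pairs (made-up bits).
X_train = [
    [1, 1, 0, 0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 1, 0, 0],
    [0, 1, 1, 0, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 0],
]
# Invented log10 K_M values (K_M in mol/L); not real data.
y_train = [-3.0, -4.2, -3.5, -3.9]

X_test = [
    [1, 1, 0, 0, 1, 0, 0, 0],   # identical to the first training pair
    [0, 0, 0, 1, 0, 0, 1, 1],   # shares no bits with any training pair
]

mean, var = predict_log_km(X_train, y_train, X_test)
for m, s2 in zip(mean, var):
    print(f"predicted log10 K_M = {m:.2f}, variance = {s2:.3f}")
```

In this sketch a query that matches a training pair gets a low predictive variance, while a query sharing no fingerprint bits with the training set falls back to the prior mean with the full prior variance. That behaviour is what would make such a model usable for ranking candidate enzymes by confidence.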
