Article

Machine learning-based prediction of activity and substrate specificity for OleA enzymes in the thiolase superfamily

Journal

SYNTHETIC BIOLOGY
Volume 5, Issue 1

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/synbio/ysaa004

Keywords

thiolase; p-nitrophenyl esters; substrate specificity; machine learning; enzyme activity screen

Funding

  1. U.S. Department of Energy (DOE) Joint Genome Institute, a DOE Office of Science User Facility [DE-AC02-05CH11231]
  2. National Science Foundation [00039202]
  3. National Institutes of Health Biotechnology training grant [5T32GM008347-27]
  4. MnDRIVE initiative for Industry and the Environment

Enzymes in the thiolase superfamily catalyze carbon-carbon bond formation for the biosynthesis of polyhydroxyalkanoate storage molecules, membrane lipids and bioactive secondary metabolites. Natural and engineered thiolases have applications in synthetic biology for the production of high-value compounds, including personal care products and therapeutics. A fundamental understanding of thiolase substrate specificity is lacking, particularly within the OleA protein family. The ability to predict substrates from sequence would advance (meta)genome mining efforts to identify active thiolases for the production of desired metabolites. To gain a deeper understanding of substrate scope within the OleA family, we measured the activity of 73 diverse bacterial thiolases with a library of 15 p-nitrophenyl ester substrates to build a training set of 1095 unique enzyme-substrate pairs. We then used machine learning to predict thiolase substrate specificity from physicochemical and structural features. The area under the receiver operating characteristic curve was 0.89 for random forest classification of enzyme activity, and our regression model had a test set root mean square error of 0.22 (R² = 0.75) to quantitatively predict enzyme activity levels. Substrate aromaticity, oxygen content and molecular connectivity were the strongest predictors of enzyme-substrate pairing. Key amino acid residues A173, I284, V287, T292 and I316 in the Xanthomonas campestris OleA crystal structure lining the substrate binding pockets were important for thiolase substrate specificity and are attractive targets for future protein engineering studies. The predictive framework described here is generalizable and demonstrates how machine learning can be used to quantitatively understand and predict enzyme substrate specificity.
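
The evaluation quantities named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' pipeline: the enzyme and substrate identifiers are hypothetical placeholders, the helper names (`auroc`, `rmse`, `r_squared`) are chosen here for illustration, and no study data are used. It shows how 73 enzymes crossed with 15 substrates give 1095 pairs, and how the area under the ROC curve, the root mean square error and R² are defined.

```python
import math
from itertools import product

# Hypothetical placeholder names; only the counts (73 and 15) come from the abstract.
enzymes = [f"enzyme_{i:02d}" for i in range(1, 74)]
substrates = [f"substrate_{j:02d}" for j in range(1, 16)]
pairs = list(product(enzymes, substrates))  # one entry per enzyme-substrate pair


def auroc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one active and one inactive label")
    # Fraction of (active, inactive) pairs ranked correctly; ties count half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy values invented purely to exercise the functions.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.7, 0.6, 0.4, 0.3, 0.1]
toy_true = [0.0, 1.0, 2.0, 3.0]
toy_pred = [0.5, 1.0, 2.0, 2.5]

print(len(pairs))
print(auroc(toy_labels, toy_scores))
print(rmse(toy_true, toy_pred), r_squared(toy_true, toy_pred))
```

In practice a classifier such as a random forest would supply the scores passed to `auroc`, and a regression model would supply the predictions passed to `rmse` and `r_squared`. The toy lists above only stand in for such model outputs.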
