4.7 Article

Machine Learning Enables Prediction of Pyrrolysyl-tRNA Synthetase Substrate Specificity

Journal

ACS SYNTHETIC BIOLOGY
Volume 12, Issue 8, Pages 2403-2417

Publisher

AMER CHEMICAL SOC
DOI: 10.1021/acssynbio.3c00225

Keywords

machine learning; pyrrolysyl-tRNA synthetase; noncanonical amino acids; enzyme engineering; substrate specificity

This study developed machine learning models to predict the substrate specificity of PylRS for novel NCAAs. The models showed high accuracy and provided a framework for expanding the substrate scope of PylRS variants and developing machine learning models for other PylRS variants.
Knowledge about the substrate scope for a given enzyme is informative for elucidating biochemical pathways and also for expanding applications of the enzyme. However, no general methods are available to accurately predict the substrate specificity of an enzyme. Pyrrolysyl-tRNA synthetase (PylRS) is a powerful tool for incorporating various noncanonical amino acids (NCAAs) into proteins, which enabled us to probe, image, rationally engineer, and evolve protein structure and function. However, the incorporation of a new NCAA typically requires the selection of large libraries of PylRS with randomized mutations at active sites, and this process requires multiple rounds of selection for each new substrate. Therefore, a single aminoacyl-tRNA synthetase with broad substrate promiscuity is ideal to facilitate widespread applications of the genetic NCAA incorporation technique. Herein, machine learning models were developed to predict the substrate specificity of PylRS to accept novel NCAAs that could be incorporated into proteins by three PylRS mutants. The models were built from a training set of 285 unique enzyme-substrate pairs of three PylRS mutants including IFRS, BtaRS, and MFRS against 95 NCAAs. The best BaggingTree (BT) model was then used for virtually screening an NCAA library containing 1474 phenylalanine, tyrosine, tryptophan, and alanine analogues, and 156 NCAAs were predicted to be accepted by at least one of the three PylRS mutants. Then, 27 NCAAs including 24 positive and 3 negative substrates were experimentally tested for their activities, and 20 of the 24 positive substrates showed weak or strong activity and were accepted by at least one PylRS mutant, among which 11 NCAAs were never reported to be incorporated into proteins before. Three negative substrates did not show any activity. Experimental results suggested that the BT model provides a three-class classification accuracy of 0.69 and a binary classification accuracy of 0.86. This study expanded the substrate scope of three PylRS variants and provided a framework for developing machine learning models to predict substrate specificity of other PylRS variants.
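
The abstract reports both a three-class accuracy (0.69) and a binary accuracy (0.86) for the same model. The sketch below illustrates, under stated assumptions, how the two figures can differ: the three-class score requires the predicted activity level to match exactly, while the binary score only asks whether a pair is active at all. The class names ("none", "weak", "strong"), the helper functions, and the eight example labels are hypothetical and are not data from the paper; the abstract does not state the exact evaluation protocol.

```python
# Hypothetical activity classes for an enzyme-substrate pair (assumed names).
CLASSES = ("none", "weak", "strong")


def accuracy(y_true, y_pred):
    """Fraction of pairs whose predicted label equals the measured label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    hits = sum(t == p for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


def to_binary(labels):
    """Collapse weak and strong activity into a single 'active' class."""
    return ["inactive" if label == "none" else "active" for label in labels]


# Made-up labels for eight pairs, for illustration only.
measured = ["strong", "weak", "none", "weak", "strong", "none", "weak", "none"]
predicted = ["strong", "strong", "none", "weak", "weak", "none", "none", "none"]

three_class_acc = accuracy(measured, predicted)
binary_acc = accuracy(to_binary(measured), to_binary(predicted))

print(f"three-class accuracy: {three_class_acc:.3f}")
print(f"binary accuracy:      {binary_acc:.3f}")
```

Because confusing weak with strong activity counts as an error only in the three-class setting, the binary accuracy computed this way can never be lower than the three-class accuracy on the same predictions.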
