Journal
NATURE CHEMICAL BIOLOGY
Volume 14, Issue 12, Pages 1109-+
Publisher
NATURE PUBLISHING GROUP
DOI: 10.1038/s41589-018-0154-9
Funding
- BBSRC [EGA16205, EGA16206, EGA17763]
- EPSRC (UK Catalysis Hub) [EP/K014668/1, EP/M013219/1]
- EPSRC [EP/K014668/1] Funding Source: UKRI
The elucidation and prediction of how changes in a protein result in altered activities and selectivities remain a major challenge in chemistry. Two hurdles have prevented accurate family-wide models: obtaining (i) diverse datasets and (ii) suitable parameter frameworks that encapsulate activities in large sets. Here, we show that a relatively small but broad activity dataset is sufficient to train algorithms for functional prediction over the entire glycosyltransferase superfamily 1 (GT1) of the plant Arabidopsis thaliana. Whereas sequence analysis alone failed for GT1 substrate utilization patterns, our chemical-bioinformatic model, GT-Predict, succeeded by coupling physicochemical features with isozyme-recognition patterns over the family. GT-Predict identified GT1 biocatalysts for novel substrates and enabled functional annotation of uncharacterized GT1s. Finally, analyses of GT-Predict decision pathways revealed structural modulators of substrate recognition, thus providing information on mechanisms. This multifaceted approach to enzyme prediction may guide the streamlined utilization (and design) of biocatalysts and the discovery of other family-wide protein functions.