Article

In silico characterization of protein chimeras: Relating sequence and function within the same fold

Journal

PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS
Volume 77, Issue 1, Pages 111-120

Publisher

WILEY
DOI: 10.1002/prot.22422

Keywords

bioinformatics; protein design; DNA shuffling; protein structure; machine learning; kernel method

Funding

  1. UQ-Enabling
  2. ARC Center of Excellence in Bioinformatics

The exploration of novel proteins via recombination of fragments derived from structurally homologous proteins has enormous potential for medicine and biotechnology. This modular exchange of sequence material puts novel activities, substrate specificities, and stability within reach of a semi-random search. This article takes stock of the growing resource of experimentally characterized chimeric proteins within a homologous protein family to build sequence-function models that can effectively guide the construction of new libraries. A novel framework for predicting structural viability of chimeric proteins, only assuming knowledge of their sequence and their parental structure, is presented. Removing a major barrier in previous work, the model processes any sequence that derives from parents with similar folds. The method naturally mixes test and training data from site-directed recombination, DNA shuffling, or random mutagenesis experiments. We train a model from a site-directed recombination library with state-of-the-art prediction accuracy on hold-out test data from the same experimental source and convincing performance on chimeras with a different origin. Specifically, the model is used to assess the structural viability of P450 chimeras deriving from proteins with only 18% sequence similarity to those used for model training.
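
To make the idea of a kernel-based sequence-function model concrete, the following is a minimal sketch in Python. It is not the model described in the article, which also draws on the parental structure. Everything here is an assumption made for illustration: a simple position-wise match kernel over aligned sequences, a kernel perceptron as the learner, and invented toy chimeras with invented viability labels (+1 viable, -1 not viable).

```python
from typing import List, Sequence


def match_kernel(a: str, b: str) -> float:
    """Fraction of aligned positions at which two sequences carry the same residue.

    Illustrative stand-in kernel (assumption); it equals the scaled inner
    product of one-hot encodings, so it is a valid kernel.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def decision_score(seq: str, train_seqs: Sequence[str],
                   labels: Sequence[int], alphas: Sequence[int]) -> float:
    """Weighted sum of kernel similarities to the training sequences."""
    return sum(a * y * match_kernel(s, seq)
               for a, s, y in zip(alphas, train_seqs, labels))


def train_kernel_perceptron(train_seqs: Sequence[str], labels: Sequence[int],
                            epochs: int = 20) -> List[int]:
    """Kernel perceptron: count the mistakes made on each training example."""
    alphas = [0] * len(train_seqs)
    for _ in range(epochs):
        mistakes = 0
        for i, (seq, y) in enumerate(zip(train_seqs, labels)):
            if y * decision_score(seq, train_seqs, labels, alphas) <= 0:
                alphas[i] += 1
                mistakes += 1
        if mistakes == 0:  # every training example is classified correctly
            break
    return alphas


def predict(seq: str, train_seqs: Sequence[str],
            labels: Sequence[int], alphas: Sequence[int]) -> int:
    """Return +1 (predicted viable) or -1 (predicted not viable)."""
    return 1 if decision_score(seq, train_seqs, labels, alphas) > 0 else -1


# Hypothetical toy data: two 4-residue "parents" (AAAA and CCCC) recombined
# in blocks of two. The labels are invented, not experimental results.
train_seqs = ["AAAA", "AACC", "CCAA", "CCCC"]
labels = [1, 1, -1, -1]

alphas = train_kernel_perceptron(train_seqs, labels)

# Score a sequence that is not an exact block chimera, e.g. a point mutant.
print(predict("AACA", train_seqs, labels, alphas))
```

Because the learner touches sequences only through the kernel, chimeras from site-directed recombination, shuffling, or mutagenesis can be scored the same way as long as they can be aligned, which is the property the abstract highlights. A more realistic kernel would have to be substituted for `match_kernel` before any conclusions are drawn.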
