4.7 Article

PERISCOPE-Opt: Machine learning-based prediction of optimal fermentation conditions and yields of recombinant periplasmic protein expressed in Escherichia coli

Journal

COMPUTATIONAL AND STRUCTURAL BIOTECHNOLOGY JOURNAL
Volume 20, Pages 2909-2920

Publisher

ELSEVIER
DOI: 10.1016/j.csbj.2022.06.006

Keywords

Optimization; Machine learning; Recombinant protein production; Periplasmic expression; Prediction model; Escherichia coli

Funding

  1. Fundamental Research Grant Scheme (FRGS), Malaysia [FRGS/1/2016/TK02/MUSM/02/3]
  2. Monash University
  3. Graduate Research Scholarships, Monash University Malaysia


This study combines features from fermentation process conditions and amino acid sequences to construct a machine learning-based model for predicting the maximal protein yields and corresponding fermentation conditions for recombinant protein production. The model achieves good performance in independent tests and provides a reliable alternative to trial-and-error experiments.
Optimization of the fermentation process for recombinant protein production (RPP) is often resource intensive. Machine learning (ML) approaches help minimize experimentation and are widely applied in RPP. However, these ML-based tools primarily focus on features derived from the amino acid sequence and leave out the influence of fermentation process conditions. The present study combines features derived from fermentation process conditions with those derived from the amino acid sequence to construct an ML-based model that predicts the maximal protein yield and the corresponding fermentation conditions for expression of a target recombinant protein in the Escherichia coli periplasm. Two sets of XGBoost classifiers were employed in the first stage to classify the expression level of the target protein as high (>50 mg/L), medium (between 0.5 and 50 mg/L), or low (<0.5 mg/L). The second-stage framework consisted of three regression models, involving support vector machines and random forest, that predict the expression yield within each expression-level class. Independent tests showed that the predictor achieved an overall average accuracy of 75% and a Pearson correlation coefficient of 0.91 for the correctly classified instances. Our model therefore offers a reliable substitute for numerous trial-and-error experiments when identifying the optimal fermentation conditions and yield for RPP. It is also implemented as an open-access webserver, PERISCOPE-Opt (http://periscope-opt.erc.monash.edu).
© 2022 The Authors. Published by Elsevier B.V. on behalf of Research Network of Computational and Structural Biotechnology. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
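The two-stage idea described in the abstract (classify the expression level first, then apply a regressor specific to that class) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `expression_class` and `predict_yield`, the treatment of the exact boundary values 0.5 and 50 mg/L, the use of a single classifier callable, and the stand-in callables in the usage lines are all assumptions made for illustration. Only the thresholds themselves come from the abstract.

```python
from typing import Callable, Dict, Sequence, Tuple

# Class thresholds in mg/L, as stated in the abstract.
LOW_MAX = 0.5
HIGH_MIN = 50.0

Features = Sequence[float]


def expression_class(yield_mg_per_l: float) -> str:
    """Label a yield as 'low', 'medium', or 'high'.

    Assumption: values of exactly 0.5 or 50 mg/L are treated as 'medium';
    the abstract does not say how the boundaries are handled.
    """
    if yield_mg_per_l > HIGH_MIN:
        return "high"
    if yield_mg_per_l < LOW_MAX:
        return "low"
    return "medium"


def predict_yield(
    features: Features,
    classify: Callable[[Features], str],
    regressors: Dict[str, Callable[[Features], float]],
) -> Tuple[str, float]:
    """Stage 1 picks a class; stage 2 runs that class's regressor."""
    label = classify(features)
    return label, regressors[label](features)


# Hypothetical stand-ins for trained models (not real classifiers/regressors).
toy_classifier = lambda x: "high" if x[0] > 1.0 else "low"
toy_regressors = {
    "high": lambda x: 60.0,
    "medium": lambda x: 10.0,
    "low": lambda x: 0.1,
}

label, predicted = predict_yield([2.0], toy_classifier, toy_regressors)
```

In a real pipeline the stand-ins would be replaced by fitted models; keeping one regressor per class means each regressor only has to cover the yield range of its own class.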
