Article

The potential of random forest and neural networks for biomass and recombinant protein modeling in Escherichia coli fed-batch fermentations

Journal

BIOTECHNOLOGY JOURNAL
Volume 10, Issue 11, Pages 1770-1782

Publisher

Wiley-VCH Verlag GmbH
DOI: 10.1002/biot.201400790

Keywords

Artificial neural networks; Process analytical technology; Quality by design; Random forest; Recombinant protein production

Funding

  1. Federal Ministry of Traffic, Innovation and Technology (bmvit)
  2. Federal Ministry of Economy, Family and Youth (BMWFJ)
  3. Styrian Business Promotion Agency SFG
  4. Standortagentur Tirol
  5. ZIT - Technology Agency of the City of Vienna through the COMET-Funding Program

Abstract

Quality assurance strategies in biopharmaceutical production are currently shifting from empirical quality by testing to rational, knowledge-based quality by design approaches. The major challenges in this context are the fragmentary understanding of bioprocesses and the severely limited real-time access to process variables related to product quality and quantity. Data-driven modeling of process variables, combined with model predictive process control concepts, represents a potential solution to these problems. Selecting the statistical techniques best suited to bioprocess data analysis and modeling is a key step. In this work, a series of recombinant Escherichia coli fed-batch production processes was conducted under varying cultivation conditions, using a comprehensive online and offline process monitoring platform. The applicability of two machine learning methods, random forest and neural networks, was investigated for predicting cell dry mass and recombinant protein from process parameters available online and from two-dimensional multi-wavelength fluorescence spectroscopy. Models based solely on routinely measured process variables predict the cell dry mass with a satisfactory accuracy of about ±4%, while additional spectroscopic information allows the protein concentration to be estimated within ±12%. The results clearly argue for a combined approach: neural networks as the modeling technique and random forest as the variable selection tool.
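
The combined approach named in the last sentence of the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the use of scikit-learn, the function name `select_and_fit`, the number of retained variables, the network size, and the synthetic data are all assumptions made for illustration only. A random forest ranks the input variables by importance, and a small neural network is then trained on the top-ranked ones.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def select_and_fit(X, y, n_keep=5, seed=0):
    """Rank variables with a random forest, then fit a small neural
    network on the top-ranked ones (hypothetical helper, not from the paper).

    Returns the sorted indices of the selected columns and the fitted model.
    """
    forest = RandomForestRegressor(n_estimators=200, random_state=seed)
    forest.fit(X, y)
    # Impurity-based importances: larger values mean the variable
    # contributed more to reducing variance across the trees' splits.
    ranked = np.argsort(forest.feature_importances_)[::-1]
    selected = np.sort(ranked[:n_keep])

    # Standardize inputs before the network; layer size is an assumption.
    net = make_pipeline(
        StandardScaler(),
        MLPRegressor(hidden_layer_sizes=(8,), solver="lbfgs",
                     max_iter=2000, random_state=seed),
    )
    net.fit(X[:, selected], y)
    return selected, net


# Synthetic stand-in data (NOT process data from the study):
# 20 candidate variables, of which only columns 0, 3 and 7 carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 20))
y = (2.0 * X[:, 0] - 1.5 * X[:, 3] + 0.5 * X[:, 7] ** 2
     + rng.normal(scale=0.1, size=300))

selected, model = select_and_fit(X, y, n_keep=5)
predictions = model.predict(X[:, selected])
```

In a real setting, the columns of `X` would be online process variables and fluorescence intensities, `y` an offline reference measurement such as cell dry mass, and the model would be evaluated on held-out fermentation runs rather than on its training data.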
