Article

EPSOL: sequence-based protein solubility prediction using multidimensional embedding

Journal

BIOINFORMATICS
Volume 37, Issue 23, Pages 4314-4320

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/btab463

Funding

  1. National Key Research and Development Program of China [2018YFC0910403]
  2. National Natural Science Foundation of China [62072353, 61672406, 61532014]

This article introduces EPSOL, a novel deep learning architecture for predicting protein solubility in an E. coli expression system. It predicts the solubility of new recombinant proteins with high accuracy and reliability.
Motivation: The heterologous expression of recombinant proteins requires host cells, such as Escherichia coli, and protein solubility greatly affects protein yield. A novel, highly accurate predictor that forecasts protein solubility in an E. coli expression system before any experimental work begins, and thereby helps improve production yield and minimize production cost, is highly sought.

Results: This article presents EPSOL, a novel deep learning architecture for predicting protein solubility in an E. coli expression system, which automatically obtains comprehensive protein feature representations using multidimensional embedding. EPSOL outperformed all existing sequence-based solubility predictors, achieving an accuracy of 0.79 and a Matthews correlation coefficient of 0.58. This higher performance permits large-scale screening for sequence variants with enhanced manufacturability and predicts the solubility of new recombinant proteins in an E. coli expression system with greater reliability.
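
For readers unfamiliar with the two reported metrics, the following minimal sketch shows how accuracy and the Matthews correlation coefficient are conventionally computed from the confusion matrix of a binary classifier (soluble vs. insoluble). The function names and the counts in the example are hypothetical and chosen only for illustration. They are not taken from the EPSOL paper or its evaluation data.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary confusion matrix.

    Ranges from -1 (total disagreement) through 0 (no better than chance)
    to +1 (perfect prediction). Returns 0.0 when a marginal total is zero,
    which is the usual convention for the undefined case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical counts, NOT results from the paper:
    # tp = soluble predicted soluble, tn = insoluble predicted insoluble,
    # fp = insoluble predicted soluble, fn = soluble predicted insoluble.
    tp, tn, fp, fn = 40, 45, 5, 10
    print(f"accuracy = {accuracy(tp, tn, fp, fn):.2f}")
    print(f"MCC      = {matthews_corrcoef(tp, tn, fp, fn):.2f}")
```

Unlike accuracy, the Matthews correlation coefficient uses all four cells of the confusion matrix, so it remains informative when the soluble and insoluble classes are unevenly represented.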
