4.7 Article

Interpreting a black box predictor to gain insights into early folding mechanisms

Publisher

ELSEVIER
DOI: 10.1016/j.csbj.2021.08.041

Funding

  1. Research Foundation Flanders (FWO) [G.0328.16 N]
  2. Flemish Government (AI Research Program)
  3. Brussels-Capital Region
  4. European Regional Development Fund (ERDF)

This study investigates a novel approach to predict early folding residues (EFRs) from protein sequences to gain mechanistic residue-level insights into the sequence determinants of EFRs in proteins.
Protein folding and function are closely connected, but the exact mechanisms by which proteins fold remain elusive. Early folding residues (EFRs) are amino acids within a particular protein that induce the very first stages of the folding process. High-resolution EFR data are available for only a few proteins; these data previously enabled the training of a protein sequence-based machine learning 'black box' predictor (EFoldMine). Such a black box approach does not allow direct extraction of the 'early folding rules' embedded in the protein sequence, yet such interpretation is essential to improve our understanding of how the folding process works. Here we apply and investigate a novel 'grey box' approach to predicting EFRs from protein sequence to gain mechanistic residue-level insights into the sequence determinants of EFRs in proteins. We interpret the rule set for three datasets: a default set composed of natural proteins, a scrambled set composed of the scrambled default-set sequences, and a set of de novo designed proteins. Finally, we relate these data to the secondary structure adopted in the folded protein and provide all information online at http://xefoldmine.bio2byte.be/ as a resource to help understand and steer early protein folding.

© 2021 The Author(s). Published by Elsevier B.V. on behalf of Research Network of Computational and Structural Biotechnology. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
