Article

Usage of a dataset of NMR resolved protein structures to test aggregation versus solubility prediction algorithms

Journal

PROTEIN SCIENCE
Volume 26, Issue 9, Pages 1864-1869

Publisher

WILEY
DOI: 10.1002/pro.3225

Keywords

NMR; soluble; database; aggregation; 3D structure; amyloid fibrils; computational approaches

Funding

  1. Institut de Biologie Computationnelle
  2. Universite de Montpellier (ANR Investissements D'Avenir Bio-informatique: projet IBC)
  3. Ministere de l'Education nationale, de l'Enseignement superieur et de la Recherche (MEESR)
  4. COST Action [BM1405]


Interest in computational methods for predicting amyloids and/or aggregates has increased, owing to the prevalence of these aggregates in numerous diseases and their recently discovered functional importance. To evaluate these methods, several datasets have been compiled. Typically, aggregation-prone regions of proteins, which form aggregates or amyloids in vivo, are more than 15 residues long and intrinsically disordered. However, the number of such experimentally established amyloid-forming and non-forming sequences is limited, not exceeding one hundred entries in existing databases. In this work, we parsed all available NMR-resolved protein structures from the PDB and assembled a new, sevenfold larger dataset of unfolded sequences that are soluble at high concentrations. We propose using these sequences as a negative set for evaluating methods that predict aggregation in vivo. We also present the results of benchmarking cutting-edge tools for predicting aggregation versus solubility propensity.
