4.7 Article

A Machine Learning Approach for Hot-Spot Detection at Protein-Protein Interfaces

Journal

Publisher

MDPI
DOI: 10.3390/ijms17081215

Keywords

protein-protein interfaces; hot-spots; machine learning; Solvent Accessible Surface Area (SASA); evolutionary sequence conservation

Funding

  1. Fundação para a Ciência e a Tecnologia [FCT-SFRH/BPD/97650/2013]
  2. FCT [UID/Multi/04349/2013, IF/00578/2014, UID/NEU/04539/2013]
  3. European Social Fund
  4. Programa Operacional Potencial Humano
  5. Marie Sklodowska-Curie Individual Fellowship [MEMBRANEPROT 659826]
  6. FEDER (Programa Operacional Factores de Competitividade-COMPETE)
  7. Center for Basic and Translational Research on Disorders of the Digestive System, Rockefeller University
  8. Icahn School of Medicine at Mount Sinai

Understanding protein-protein interactions is a key challenge in biochemistry. In this work, we describe a methodology that predicts Hot-Spots (HS) in protein-protein interfaces from their native complex structure more accurately than previously published Machine Learning (ML) techniques. Our model is trained on a large number of complexes and on a significantly larger number of structural and evolutionary sequence-based features. In particular, we added interface size, type of interaction between residues at the interface of the complex, number of different types of residues at the interface and the Position-Specific Scoring Matrix (PSSM), for a total of 79 features. We used twenty-seven algorithms, ranging from a simple linear-based function to support-vector machine models with different cost functions. The best model was obtained with the conditional inference random forest (c-forest) algorithm on a dataset pre-processed by normalization of the features and up-sampling of the minority class. On the independent test set, the method has an overall accuracy of 0.80, an F1-score of 0.73, a sensitivity of 0.76 and a specificity of 0.82.
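
The pre-processing and evaluation steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The abstract does not say which normalization was used, so z-scoring is an assumption here, as is random duplication for the up-sampling step. The function names (`normalize`, `upsample_minority`, `binary_metrics`) are hypothetical, and the c-forest classifier itself is left out.

```python
import random
from statistics import mean, pstdev


def normalize(rows):
    """Z-score every feature column (assumed scheme, not confirmed by the abstract)."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    sds = [pstdev(c) or 1.0 for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]


def upsample_minority(rows, labels, seed=0):
    """Randomly duplicate rows of the smaller class until all classes have equal size."""
    rng = random.Random(seed)
    by_class = {}
    for row, y in zip(rows, labels):
        by_class.setdefault(y, []).append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for y, members in by_class.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        out_rows.extend(members + extra)
        out_labels.extend([y] * target)
    return out_rows, out_labels


def _ratio(num, den):
    """Safe division: return 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity, specificity and F1, with `positive` as the hot-spot class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    sensitivity = _ratio(tp, tp + fn)  # recall on hot-spots
    specificity = _ratio(tn, tn + fp)  # recall on non-hot-spots
    precision = _ratio(tp, tp + fp)
    return {
        "accuracy": _ratio(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": _ratio(2 * precision * sensitivity, precision + sensitivity),
    }
```

A common precaution, not stated in the abstract, is to up-sample only the training split. Otherwise duplicated rows can leak into the independent test set and inflate the reported metrics.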
