Article

GR-Align: fast and flexible alignment of protein 3D structures using graphlet degree similarity

Journal

BIOINFORMATICS
Volume 30, Issue 9, Pages 1259-1265

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/btu020

Funding

  1. European Research Council (ERC) Starting Independent Researcher [278212]
  2. National Science Foundation (NSF)
  3. Cyber-Enabled Discovery and Innovation (CDI) [OIA-1028394]
  4. ARRS [J1-5454]
  5. Serbian Ministry of Education and Science Project [III44006]
  6. Office of Integrative Activities
  7. Office Of The Director [1028394] Funding Source: National Science Foundation

Motivation: Protein structure alignment is key to transferring information from well-studied proteins to less-studied ones. Structural alignment identifies the most precise mapping of equivalent residues, as structures are more conserved during evolution than sequences. Among the methods for aligning protein structures, maximum Contact Map Overlap (CMO) has received sustained attention during the past decade. Yet, known algorithms exhibit modest performance and are not applicable to large-scale comparison.

Results: Graphlets are small induced subgraphs that are used to design sensitive topological similarity measures between nodes and networks. By generalizing graphlets to ordered graphs, we introduce GR-Align, a CMO heuristic that is suited for database searches. On the Proteus_300 set (44 850 protein domain pairs), GR-Align is several orders of magnitude faster than the state-of-the-art CMO solvers Apurva, MSVNS and AlEigen7, and its similarity score is in better agreement with the structural classification of proteins. In a large-scale experiment on the Gold-standard benchmark dataset (3 207 270 protein domain pairs), GR-Align is several orders of magnitude faster than the state-of-the-art protein structure comparison tools TM-Align, DaliLite, MATT and Yakusa, while achieving similar classification performance. Finally, we illustrate the difference between GR-Align's flexible alignments and the traditional ones by querying a flexible protein in the Astral-40 database (11 154 protein domains). In this experiment, GR-Align's top-scoring alignments are not only in better agreement with the structural classification of proteins, but also allow transferring more information across proteins.
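
To make the Contact Map Overlap objective concrete, the following is a minimal Python sketch of how an overlap score could be computed for one given alignment. It is not GR-Align and uses no graphlets. The function names `contact_map` and `contact_overlap`, the 7.5 Å distance cutoff, and the representation of an alignment as a dictionary of residue indices are all assumptions made for illustration. The sketch only scores a supplied alignment; searching for the alignment that maximizes the score is the hard part that CMO solvers and heuristics address.

```python
from itertools import combinations


def contact_map(coords, cutoff=7.5):
    """Return the set of residue index pairs (i, j), i < j, in contact.

    coords: list of (x, y, z) tuples, one per residue (e.g. C-alpha atoms).
    cutoff: distance threshold in angstroms (assumed value, not from the paper).
    """
    contacts = set()
    for i, j in combinations(range(len(coords)), 2):
        dist = sum((a - b) ** 2 for a, b in zip(coords[i], coords[j])) ** 0.5
        if dist <= cutoff:
            contacts.add((i, j))
    return contacts


def contact_overlap(contacts_a, contacts_b, alignment):
    """Count contacts of protein A that map onto contacts of protein B.

    alignment: dict mapping residue indices of A to residue indices of B.
    The mapping must preserve sequence order (strictly increasing).
    """
    pairs = sorted(alignment.items())
    for (_, b_prev), (_, b_next) in zip(pairs, pairs[1:]):
        if b_prev >= b_next:
            raise ValueError("alignment must preserve residue order")

    shared = 0
    for i, k in contacts_a:
        if i in alignment and k in alignment:
            # Order preservation guarantees alignment[i] < alignment[k].
            if (alignment[i], alignment[k]) in contacts_b:
                shared += 1
    return shared
```

For example, with four residues spaced 4 Å apart on a line in protein A and three such residues in protein B, aligning residues 0, 1, 2 of A to residues 0, 1, 2 of B preserves two contacts, whereas aligning residues 0, 2, 3 of A to residues 0, 1, 2 of B preserves only one.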
