4.7 Article

Comprehensive Analysis of Constraint on the Spatial Distribution of Missense Variants in Human Protein Structures

Journal

AMERICAN JOURNAL OF HUMAN GENETICS
Volume 102, Issue 3, Pages 415-426

Publisher

CELL PRESS
DOI: 10.1016/j.ajhg.2018.01.017

Funding

  1. NIH [T32 EY021453, R01 GM080403, R01 GM099842, R01 HL122010, U54 AG052427, UF01 AG07133]
  2. SPORE grant from the Vanderbilt-Ingram Cancer Center
  3. Vanderbilt Ambassadors Discovery Grant in Cancer Research

Abstract

The spatial distribution of genetic variation within proteins is shaped by evolutionary constraint and provides insight into the functional importance of protein regions and the potential pathogenicity of protein alterations. Here, we comprehensively evaluate the 3D spatial patterns of human germline and somatic variation in 6,604 experimentally derived protein structures and 33,144 computationally derived homology models covering 77% of all human proteins. Using a systematic approach, we quantify differences in the spatial distributions of neutral germline variants, disease-causing germline variants, and recurrent somatic variants. Neutral missense variants exhibit a general trend toward spatial dispersion, which is driven by constraint on core residues. In contrast, germline disease-causing variants are generally clustered in protein structures and form clusters more frequently than recurrent somatic variants identified from tumor sequencing. In total, we identify 215 proteins with significant spatial constraints on the distribution of disease-causing missense variants in experimentally derived protein structures, only 65 (30%) of which have been previously reported. This analysis identifies many clusters not detectable from sequence information alone; only 12% of proteins with significant clustering in 3D were identified from similar analyses of linear protein sequence. Furthermore, spatial analyses of mutations in homology-based structural models are highly correlated with those from experimentally derived structures, supporting the use of computationally derived models. Our approach highlights significant differences in the spatial constraints on different classes of mutations in protein structure and identifies regions of potential function within individual proteins.
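The abstract distinguishes variants that cluster in 3D from variants that are spatially dispersed, but it does not name the statistic used. As a minimal sketch of the general idea only, and not the authors' method, the Python below runs a permutation test on the mean pairwise distance between variant residues. The function names, the toy 5 x 5 x 5 "protein" grid, the 3.8 Å spacing, and the choice of statistic are all assumptions made for illustration.

```python
import math
import random
from itertools import combinations


def mean_pairwise_distance(coords):
    """Average Euclidean distance over all unordered pairs of 3D points."""
    pairs = list(combinations(coords, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


def spatial_permutation_test(residue_coords, variant_idx, n_perm=2000, seed=0):
    """Compare the observed variant spread with random residue sets of equal size.

    Returns (observed, p_clustered, p_dispersed), where p_clustered is the
    fraction of random sets at least as compact as the observed variants and
    p_dispersed is the fraction at least as spread out (both with a +1
    correction so that neither p-value is ever exactly zero).
    """
    rng = random.Random(seed)
    k = len(variant_idx)
    observed = mean_pairwise_distance([residue_coords[i] for i in variant_idx])
    n_compact = 0
    n_spread = 0
    for _ in range(n_perm):
        sample = rng.sample(range(len(residue_coords)), k)
        d = mean_pairwise_distance([residue_coords[i] for i in sample])
        if d <= observed:
            n_compact += 1
        if d >= observed:
            n_spread += 1
    p_clustered = (n_compact + 1) / (n_perm + 1)
    p_dispersed = (n_spread + 1) / (n_perm + 1)
    return observed, p_clustered, p_dispersed


# Hypothetical toy structure: 125 residues on a 5 x 5 x 5 grid, 3.8 Å apart.
SPACING = 3.8
toy_coords = [
    (x * SPACING, y * SPACING, z * SPACING)
    for x in range(5) for y in range(5) for z in range(5)
]


def grid_index(x, y, z):
    """Index into toy_coords for the grid position (x, y, z)."""
    return 25 * x + 5 * y + z


# Five neighbouring residues in one corner (a "clustered" variant set).
clustered_set = [grid_index(*p) for p in
                 [(0, 0, 0), (0, 0, 1), (0, 1, 0), (1, 0, 0), (1, 1, 0)]]
# Five far-apart corner residues (a "dispersed" variant set).
dispersed_set = [grid_index(*p) for p in
                 [(0, 0, 0), (4, 4, 4), (4, 0, 0), (0, 4, 0), (0, 0, 4)]]

obs_c, p_clustered_c, p_dispersed_c = spatial_permutation_test(toy_coords, clustered_set)
obs_d, p_clustered_d, p_dispersed_d = spatial_permutation_test(toy_coords, dispersed_set)
```

In this sketch a small `p_clustered` suggests the variants sit closer together than chance, and a small `p_dispersed` suggests they are more spread out. Real analyses would use residue coordinates from a structure file and would need to correct for testing many proteins.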
