Article

Mapping OMIM Disease-Related Variations on Protein Domains Reveals an Association Among Variation Type, Pfam Models, and Disease Classes

Journal

FRONTIERS IN MOLECULAR BIOSCIENCES
Volume 8

Publisher

FRONTIERS MEDIA SA
DOI: 10.3389/fmolb.2021.617016

Keywords

protein variations; protein structure; protein domain; variation type; disease-related variations; disease variant databases; Pfam-disease association

Funding

  1. PRIN2017 grant from the Italian Ministry of University and Research [2017483NH8_002]

Human genome resequencing projects provide detailed data on single-nucleotide variations in protein-coding regions, many of which are linked to genetic diseases. However, understanding the molecular mechanisms behind these diseases is still limited. This study aims to identify Pfam domains statistically associated with disease-related variations, using 2,513 human protein sequences and 22,763 disease-related variations. The study finds that Pfam models can serve as specific markers to bridge genes, diseases, and disease classes.
Human genome resequencing projects provide an unprecedented amount of data on single-nucleotide variations that occur in protein-coding regions and often lead to observable changes in the covalent structure of gene products. For many of these variations, links to Online Mendelian Inheritance in Man (OMIM) genetic diseases are available and are reported in databases that collect human variation data, such as Humsavar. However, current knowledge of the molecular mechanisms that lead to disease is, in many cases, still limited. To understand the complex mechanisms behind disease onset, identifying putative models that take into account the protein structure and the physicochemical features of the variations can be useful in many contexts, including early diagnosis and prognosis. In this study, we investigate the occurrence and distribution of human disease-related variations in the context of Pfam domains. Our aim is to identify and characterize Pfam domains that are statistically more likely to be associated with disease-related variations. The study considers 2,513 human protein sequences carrying 22,763 disease-related variations. We describe patterns of disease-related variation types in one-to-one (biunivocal) relation with Pfam domains, which are possible markers for linking Pfam domains to OMIM diseases. Furthermore, we use the specific association between disease-related variation types and Pfam domains to cluster diseases according to the Human Disease Ontology, and we establish a relation among variation types, Pfam domains, and disease classes. We find that Pfam models are specific markers of patterns of variation types and that they can serve to bridge genes, diseases, and disease classes. Data are available as Supplementary Material for 1,670 Pfam models, including 22,763 disease-related variations associated with 3,257 OMIM diseases.
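
The abstract does not say which statistical test was used to decide that a Pfam domain is "statistically more likely" to carry disease-related variations. As an illustration only, the sketch below shows one common way to frame such a question: a one-sided hypergeometric enrichment test. The function name, its parameters, and the toy counts are all assumptions made for this example; none of them come from the paper.

```python
from math import comb


def hypergeom_enrichment_p(k: int, n: int, K: int, N: int) -> float:
    """One-sided hypergeometric tail probability P(X >= k).

    Hypothetical setup (an assumption, not the paper's stated method):
      N: total residue positions in the protein data set
      K: positions that carry a disease-related variation
      n: positions covered by the Pfam domain of interest
      k: disease-related variations that fall inside that domain
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum the probability mass for k, k+1, ..., upper variant positions.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Toy counts, invented for illustration: a domain covering 100 of 1,000
# positions that holds 15 of the 50 disease-related variations.
p_value = hypergeom_enrichment_p(k=15, n=100, K=50, N=1000)
print(f"enrichment p-value: {p_value:.3g}")
```

A small p-value would suggest that the domain holds more disease-related variations than expected by chance. Scanning many Pfam models this way would also call for a multiple-testing correction.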
