Article

Size distribution of function-based human gene sets and the split-merge model

Journal

ROYAL SOCIETY OPEN SCIENCE
Volume 3, Issue 8, Pages -

Publisher

ROYAL SOC
DOI: 10.1098/rsos.160275

Keywords

gene family sizes; gene set sizes; power-law; beta rank function

Funding

  1. Robert S. Boas Center for Genomics and Human Genetics
  2. PAPIIT/UNAM [IN107414]
  3. PASPA/UNAM
  4. CONACYT Mexico


The sizes of paralogues (gene families produced by ancestral duplication) are known to follow a power-law distribution. We examine the size distribution of gene sets or gene families in which genes are grouped by a similar function or a shared property. The size distribution of Human Gene Nomenclature Committee (HGNC) gene sets deviates from the power law and is fitted much better by a beta rank function. We propose a simple mechanism that breaks a power-law size distribution through a combination of splitting and merging operations. The largest gene sets are split in two to account for subfunctional categories, and a small proportion of the other gene sets are merged into larger sets as new common themes may be recognized. Such operations are not uncommon for a curator of gene sets. A simulation shows that iterating these operations changes the size distribution of Ensembl paralogues and could lead to a distribution fitted by a beta rank function. We further illustrate the application of the beta rank function with the example of the distribution of transcription factors and drug target genes among HGNC gene families.
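The split-merge mechanism described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' simulation: the abstract does not give the split rule, the merge rate, or the starting data, so the uniform random split point, the `merge_fraction` parameter, the random pairing of sets to merge, and the synthetic power-law starting sizes are all assumptions made for illustration. The `beta_rank` helper uses the form commonly written as f(r) = C (N + 1 - r)^b / r^a, which reduces to a pure power law when b = 0; the paper's exact parameterization may differ.

```python
import random


def beta_rank(r, n, a, b, c=1.0):
    """Beta rank function in its commonly used form (assumed, not taken from the abstract).

    r is the rank (1 = largest set), n is the number of sets.
    With b = 0 this reduces to the power law c / r**a.
    """
    return c * (n + 1 - r) ** b / r ** a


def power_law_sizes(n, a=1.0, c=1000.0):
    """Hypothetical starting point: n set sizes following a rank power law."""
    return [max(1, round(c / r ** a)) for r in range(1, n + 1)]


def split_merge_step(sizes, merge_fraction=0.01, rng=random):
    """One iteration: split the largest set in two, merge a few of the others."""
    rest = sorted(sizes, reverse=True)
    largest = rest.pop(0)

    # Merge a small proportion of the remaining sets, pairing them at random
    # (assumed rule; the abstract only says "a small proportion").
    n_merges = int(merge_fraction * len(rest))
    for _ in range(n_merges):
        if len(rest) < 2:
            break
        i, j = rng.sample(range(len(rest)), 2)
        merged = rest[i] + rest[j]
        for k in sorted((i, j), reverse=True):  # pop the higher index first
            rest.pop(k)
        rest.append(merged)

    # Split the largest set at a uniformly random point (assumed rule).
    if largest >= 2:
        left = rng.randint(1, largest - 1)
        rest.extend([left, largest - left])
    else:
        rest.append(largest)

    return sorted(rest, reverse=True)


def simulate(sizes, n_iter=100, merge_fraction=0.01, seed=0):
    """Iterate split_merge_step and return the final ranked sizes."""
    rng = random.Random(seed)
    for _ in range(n_iter):
        sizes = split_merge_step(sizes, merge_fraction, rng)
    return sizes


if __name__ == "__main__":
    start = power_law_sizes(500)
    end = simulate(start, n_iter=200, merge_fraction=0.01)
    # Compare the head of the ranked sizes before and after iteration.
    print("before:", start[:5])
    print("after: ", end[:5])
```

Each step conserves the total number of genes, so only the way genes are partitioned into sets changes. Plotting the ranked sizes on log-log axes before and after iteration is one way to see whether the tail bends away from a straight power-law line, which is the kind of deviation the beta rank function is meant to capture.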
