Article

Proposal for an allele nomenclature system based on the evolutionary divergence of haplotypes

Journal

HUMAN MUTATION
Volume 20, Issue 6, Pages 463-472

Publisher

WILEY
DOI: 10.1002/humu.10143

Keywords

nomenclature; variation; allele; evolution; bioinformatics; haplotype

Funding

  1. NIEHS NIH HHS [30 ES06096, R01 ES06321, R01 ES10416, R01 ES08147] Funding Source: Medline

The classical view of what constitutes an allele has been challenged by recent findings of a great deal of human genetic variability, i.e., we can expect, on average, one variant site every 100-250 bases of our haploid genome. The haplotype is defined as the pattern of co-occurrence of variant sites on the same chromosome (and therefore within each particular gene). Sufficient evidence exists for the divergence of haplotypes during evolution of Homo sapiens sapiens, and the total number of haplotypes per gene will reflect the amount of time any particular ethnic group has existed on the planet, e.g., greatest in Africans, fewer in East Asians, and still fewer in Caucasians. If the average gene spans 30 kb, we can expect approximately 170 polymorphic variant sites per gene in the world population. We do not see 2^17 haplotypes, however; we might find only 10 to 200 haplotypes (depending on the gene's size and degree of conservation of the gene product). This finite number allows for a reasonable haplotype nomenclature system for each gene, based on evolutionary divergence. For polymorphic variants (i.e., frequency greater than or equal to 0.01), I propose using Arabic numerals for the major clades (e.g., *1, *2,...*20, *21), capital letters for sublineages (e.g., *2A, *2B, *2C), and Arabic numerals for sub-sublineages (e.g., *22G12, *22G13); additional subcategories may be added, in an alternating number/letter/number/letter sequence, depending on the complexity of present-day haplotypes of a particular gene. Web sites with a web master and external advisory committee should be set up for each gene superfamily, family, or individual gene (depending on complexity), and an international haplotype nomenclature committee, perhaps composed of several dozen of these web masters, should oversee haplotype nomenclature for the entire human genome. The higher heterozygosity and multiallelic nature make haplotypes more informative than biallelic SNPs. Ultimately, our knowledge of haplotype patterns, rather than single variant sites, of perhaps several hundred genes will likely be helpful in finding associations between genotype and any multiplex phenotype (e.g., complex diseases including cancer, and/or toxicity of pharmaceutical agents or environmental pollutants).
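
The alternating number/letter naming scheme proposed in the abstract can be illustrated with a short Python sketch. It is not taken from the paper: the function names `parse_haplotype` and `is_descendant` are hypothetical, and the sketch assumes that a letter level may use one or more capital letters and that a name is a sublineage of another when its levels begin with all of the other's levels.

```python
import re

# One level of a name: a run of Arabic digits or a run of capital letters.
# Assumption: letter levels may be longer than one character.
_LEVEL = re.compile(r"[0-9]+|[A-Z]+")


def parse_haplotype(name):
    """Split a name such as '*22G12' into its levels, e.g. [22, 'G', 12]."""
    if not name.startswith("*"):
        raise ValueError(f"haplotype name must start with '*': {name!r}")
    body = name[1:]
    levels = _LEVEL.findall(body)
    # Reject empty names, stray characters, and names whose first level
    # (the major clade) is not a number.
    if not levels or "".join(levels) != body or not levels[0].isdigit():
        raise ValueError(f"not a valid haplotype name: {name!r}")
    # Runs of digits and runs of letters necessarily alternate, so the
    # number/letter/number/letter pattern holds once the first is numeric.
    return [int(level) if level.isdigit() else level for level in levels]


def is_descendant(child, ancestor):
    """True if `child` names a sublineage nested under `ancestor`."""
    c, a = parse_haplotype(child), parse_haplotype(ancestor)
    return len(c) > len(a) and c[: len(a)] == a
```

Under these assumptions, `parse_haplotype("*22G12")` yields the major clade 22, sublineage G, and sub-sublineage 12, and `is_descendant("*22G12", "*22G")` is true, while `is_descendant("*2A", "*22")` is false because clade 2 and clade 22 are distinct.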