Article

Accuracy of coalescent likelihood estimates: Do we need more sites, more sequences, or more loci?

Journal

MOLECULAR BIOLOGY AND EVOLUTION
Volume 23, Issue 3, Pages 691-700

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/molbev/msj079

Keywords

coalescent; maximum likelihood; population size; sampling design

Funding

  1. NIGMS NIH HHS [R01 GM51929, R01 GM071639] Funding Source: Medline

A computer simulation study has been made of the accuracy of estimates of Theta = 4N_e mu from a sample from a single isolated population of finite size. The accuracies turn out to be well predicted by a formula developed by Fu and Li, who used optimistic assumptions. Their formulas are restated in terms of accuracy, defined here as the reciprocal of the squared coefficient of variation. This should be proportional to sample size when the entities sampled provide independent information. Using these formulas for accuracy, the sampling strategy for estimation of Theta can be investigated. Two models for cost have been used, a cost-per-base model and a cost-per-read model. The former would lead us to prefer to have a very large number of loci, each one base long. The latter, which is more realistic, causes us to prefer to have one read per locus and an optimum sample size which declines as costs of sampling organisms increase. For realistic values, the optimum sample size is 8 or fewer individuals. This is quite close to the results obtained by Pluzhnikov and Donnelly for a cost-per-base model, evaluating other estimators of Theta. It can be understood by considering that the resources spent collecting larger samples prevent us from considering more loci. An examination of the efficiency of Watterson's estimator of Theta was also made, and it was found to be reasonably efficient when the number of mutants per generation in the sequence in the whole population is less than 2.5.
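
To make the abstract's definition of accuracy concrete, here is a minimal Python sketch. It does not reproduce the Fu and Li formula, which the abstract does not state. Instead it assumes the standard no-recombination variance of Watterson's estimator, Var = Theta/a_n + b_n Theta^2/a_n^2, and computes accuracy as the squared mean divided by the variance. The function names and the `loci` parameter are hypothetical, chosen only for illustration. Theta is taken per locus, and loci are assumed independent with the same Theta.

```python
def harmonic_sums(n):
    """Return (a_n, b_n): sums of 1/i and 1/i^2 for i = 1 .. n-1."""
    a = sum(1.0 / i for i in range(1, n))
    b = sum(1.0 / i**2 for i in range(1, n))
    return a, b


def watterson_theta(segregating_sites, n):
    """Watterson's estimate of Theta from S segregating sites in n sequences."""
    a, _ = harmonic_sums(n)
    return segregating_sites / a


def watterson_accuracy(theta, n, loci=1):
    """Accuracy = 1 / CV^2 = mean^2 / variance for Watterson's estimator.

    Assumption: no recombination within a locus, so the single-locus
    variance is theta / a_n + b_n * theta^2 / a_n^2. Averaging over
    `loci` independent loci divides the variance by `loci`, so the
    accuracy scales linearly with the number of loci.
    """
    a, b = harmonic_sums(n)
    variance = theta / a + b * theta**2 / a**2
    return loci * theta**2 / variance


if __name__ == "__main__":
    # Compare a few sample sizes at a fixed per-locus Theta of 1.0.
    for n in (2, 4, 8, 16):
        print(n, watterson_accuracy(1.0, n))
```

Under these assumptions, accuracy grows only slowly with the number of sequences at a single locus, while it grows in direct proportion to the number of independent loci. That is consistent with the abstract's point that resources spent on larger samples come at the expense of more loci.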
