Article

Accuracy and power of Bayes prediction of amino acid sites under positive selection

Journal

MOLECULAR BIOLOGY AND EVOLUTION
Volume 19, Issue 6, Pages 950-958

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/oxfordjournals.molbev.a004152

Keywords

Bayes inference; likelihood; nonsynonymous-synonymous rate ratio; positive selection; posterior probability

Bayes prediction quantifies uncertainty by assigning posterior probabilities. It was used to identify amino acids in a protein under recurrent diversifying selection, indicated by higher nonsynonymous (d(N)) than synonymous (d(S)) substitution rates or by omega = d(N)/d(S) > 1. Parameters were estimated by maximum likelihood under a codon substitution model that assumed several classes of sites with different omega ratios. The Bayes theorem was used to calculate the posterior probabilities of each site falling into these site classes. Here, we evaluate the performance of Bayes prediction of amino acids under positive selection by computer simulation. We measured the accuracy by the proportion of predicted sites that were truly under selection and the power by the proportion of true positively selected sites that were predicted by the method. The accuracy was slightly better for longer sequences, whereas the power was largely unaffected by the increase in sequence length. Both accuracy and power were higher for medium or highly diverged sequences than for similar sequences. We found that accuracy and power were unacceptably low when data contained only a few highly similar sequences. However, sampling a large number of lineages improved the performance substantially. Even for very similar sequences, accuracy and power can be high if over 100 taxa are used in the analysis. We make the following recommendations: (1) prediction of positive selection sites is not feasible for a few closely related sequences; (2) using a large number of lineages is the best way to improve the accuracy and power of the prediction; and (3) multiple models of heterogeneous selective pressures among sites should be applied in real data analysis.
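
The two quantities defined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the 0.95 posterior cutoff, and all numbers in the usage lines are hypothetical, and the per-class site likelihoods are taken as given rather than computed from a codon substitution model. The first function applies Bayes' theorem to one site; the last one computes accuracy and power as the abstract defines them.

```python
def site_class_posteriors(class_proportions, site_likelihoods):
    """Posterior probability of each site class for one site (Bayes' theorem).

    class_proportions: prior proportion of sites in each class (sums to 1).
    site_likelihoods:  likelihood of the site's data under each class's omega.
    Both inputs are assumed to be supplied by an earlier estimation step.
    """
    joint = [p * f for p, f in zip(class_proportions, site_likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]


def predict_sites(posteriors_per_site, positive_classes, cutoff=0.95):
    """Sites whose summed posterior over the omega > 1 classes exceeds the cutoff.

    The cutoff value is an assumption made for illustration.
    """
    predicted = set()
    for site, posteriors in enumerate(posteriors_per_site):
        if sum(posteriors[k] for k in positive_classes) > cutoff:
            predicted.add(site)
    return predicted


def accuracy_and_power(predicted, truly_selected):
    """Accuracy: fraction of predicted sites that are truly under selection.
    Power: fraction of truly selected sites that were predicted.
    Returns None for a measure whose denominator is zero.
    """
    hits = len(predicted & truly_selected)
    accuracy = hits / len(predicted) if predicted else None
    power = hits / len(truly_selected) if truly_selected else None
    return accuracy, power


# Hypothetical usage: three site classes, the last one with omega > 1.
posteriors = site_class_posteriors([0.5, 0.3, 0.2], [0.01, 0.02, 0.10])
accuracy, power = accuracy_and_power({1, 2, 3}, {2, 3, 4, 5})
```

In a simulation study the set of truly selected sites is known by construction, which is what makes both measures computable; with real data only the predicted set is available.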
