Journal
MOLECULAR BIOLOGY AND EVOLUTION
Volume 28, Issue 3, Pages 1217-1228Publisher
OXFORD UNIV PRESS
DOI: 10.1093/molbev/msq303
Keywords
codon model; likelihood ratio test; positive selection; branch-site test
Funding
- K.C. Wong Education Foundation, Hong Kong
Ask authors/readers for more resources
The branch-site test is a likelihood ratio test to detect positive selection along prespecified lineages on a phylogeny that affects only a subset of codons in a protein-coding gene, with positive selection indicated by accelerated nonsynonymous substitutions (with omega = d(N)/d(S) > 1). This test may have more power than earlier methods, which average nucleotide substitution rates over sites in the protein and/or over branches on the tree. However, a few recent studies questioned the statistical basis of the test and claimed that the test generated too many false positives. In this paper, we examine the null distribution of the test and conduct a computer simulation to examine the false-positive rate and the power of the test. The results suggest that the asymptotic theory is reliable for typical data sets, and indeed in our simulations, the large-sample null distribution was reliable with as few as 20-50 codons in the alignment. We examined the impact of sequence length, the strength of positive selection, and the proportion of sites under positive selection on the power of the branch-site test. We found that the test was far more powerful in detecting episodic positive selection than branch-based tests, which average substitution rates over all codons in the gene and thus miss the signal when most codons are under strong selective constraint. Recent claims of statistical problems with the branch-site test are due to misinterpretations of simulation results. Our results, as well as previous simulation studies that have demonstrated the robustness of the test, suggest that the branch-site test may be a useful tool for detecting episodic positive selection and for generating biological hypotheses for mutation studies and functional analyses. The test is sensitive to sequence and alignment errors and caution should be exercised concerning its use when data quality is in doubt.
Authors
I am an author on this paper
Click your name to claim this paper and add it to your profile.
Reviews
Recommended
No Data Available