Article

Correcting the Bias of Empirical Frequency Parameter Estimators in Codon Models

Journal

PLOS ONE
Volume 5, Issue 7

Publisher

PUBLIC LIBRARY SCIENCE
DOI: 10.1371/journal.pone.0011230

Funding

  1. Joint Division of Mathematical Sciences/National Institute of General Medical Sciences Mathematical Biology Initiative [NSF-0714991]
  2. National Institutes of Health [AI47745]
  3. University of California, San Diego Center for AIDS Research/NIAID [AI36214]

Markov models of codon substitution are powerful inferential tools for studying biological processes such as natural selection and preferences in amino acid substitution. The equilibrium character distributions of these models are almost always estimated using nucleotide frequencies observed in a sequence alignment, primarily as a matter of historical convention. In this note, we demonstrate that a popular class of such estimators is biased, and that this bias has an adverse effect on goodness of fit and estimates of substitution rates. We propose a corrected empirical estimator that begins with observed nucleotide counts, but accounts for the nucleotide composition of stop codons. We show via simulation that the corrected estimates outperform the de facto standard F3×4 estimates not just by providing better estimates of the frequencies themselves, but also by leading to improved estimation of other parameters in the evolutionary models. On a curated collection of 856 sequence alignments, our estimators show a significant improvement in goodness of fit compared to the F3×4 approach. Maximum likelihood estimation of the frequency parameters appears to be warranted in many cases, albeit at a greater computational cost. Our results demonstrate that there is little justification, either statistical or computational, for continued use of the F3×4-style estimators.
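
The source of the bias can be illustrated with a small sketch. It assumes the universal genetic code (stop codons TAA, TAG, TGA) and the usual textbook form of the F3×4 estimator, in which each sense-codon frequency is the product of three position-specific nucleotide frequencies, renormalized after stop codons are excluded. The function names `f3x4` and `implied_position_freqs` are hypothetical and chosen for this illustration only, and the code does not implement the corrected estimator proposed in the paper. It only shows that the nucleotide frequencies implied by an F3×4 codon distribution differ from the nucleotide frequencies that were supplied to it.

```python
from itertools import product

NUCS = "ACGT"
STOPS = {"TAA", "TAG", "TGA"}  # assumption: universal genetic code


def f3x4(pos_freqs):
    """Textbook F3x4: product of position-specific nucleotide frequencies,
    renormalized over the 61 sense codons.

    pos_freqs is a list of three dicts (one per codon position) mapping
    each nucleotide to its frequency at that position.
    """
    raw = {}
    for codon in map("".join, product(NUCS, repeat=3)):
        if codon in STOPS:
            continue  # stop codons are not states of the codon model
        raw[codon] = (pos_freqs[0][codon[0]]
                      * pos_freqs[1][codon[1]]
                      * pos_freqs[2][codon[2]])
    total = sum(raw.values())
    return {codon: value / total for codon, value in raw.items()}


def implied_position_freqs(codon_freqs):
    """Position-specific nucleotide frequencies implied by a codon distribution."""
    implied = [dict.fromkeys(NUCS, 0.0) for _ in range(3)]
    for codon, freq in codon_freqs.items():
        for position, nucleotide in enumerate(codon):
            implied[position][nucleotide] += freq
    return implied


# Supply uniform nucleotide frequencies (0.25 each) at every codon position.
uniform = [dict.fromkeys(NUCS, 0.25) for _ in range(3)]
codon_freqs = f3x4(uniform)
implied = implied_position_freqs(codon_freqs)

# Every stop codon begins with T, so T is depleted at position 1:
# the implied frequency is 13/61, not the supplied 0.25.
print(round(implied[0]["T"], 4), round(implied[0]["A"], 4))
```

In this toy case the supplied frequency of T at the first codon position is 0.25, but the codon distribution built from it implies 13/61, because three of the sixteen codons beginning with T are stop codons. A corrected estimator, as described in the abstract, would instead choose the position-specific parameters so that the implied frequencies match the observed ones.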
