Article

Sources of Error Inherent in Species-Tree Estimation: Impact of Mutational and Coalescent Effects on Accuracy and Implications for Choosing among Different Methods

Journal

SYSTEMATIC BIOLOGY
Volume 59, Issue 5, Pages 573-583

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/sysbio/syq047

Keywords

Coalescence; gene tree; lineage sorting; mutation; phylogenetics; sample design; species tree

Funding

  1. National Science Foundation [DEB-0918218]
  2. Miller Institute, University of California, Berkeley
  3. Direct For Biological Sciences
  4. Division Of Environmental Biology [0918218] Funding Source: National Science Foundation
  5. Division Of Environmental Biology
  6. Direct For Biological Sciences [0918195] Funding Source: National Science Foundation

Discord among the estimated gene trees across loci can be attributed to both the process of mutation and incomplete lineage sorting. Effectively modeling these two sources of variation, mutational and coalescent variance, poses two distinct challenges for phylogenetic studies. Despite extensive investigation of mutational models for gene-tree estimation over the past two decades and recent attention to modeling the coalescent process for phylogenetic estimation, the effects of these two variances have yet to be evaluated simultaneously. Here, we partition the effects of mutational and coalescent processes on phylogenetic accuracy by comparing the accuracy of species trees estimated from gene trees (i.e., the actual coalescent genealogies) with that of species trees estimated from estimated gene trees (i.e., trees estimated from nucleotide sequences, which contain both coalescent and mutational variance). Not only do both mutational and coalescent variance contribute significantly to errors in species-tree estimates, but the relative magnitude of their effects on the accuracy of species-tree estimation also differs systematically depending on 1) the timing of divergence, 2) the sampling design, and 3) the method used for species-tree estimation. These findings explain why using more of the information contained in gene trees (e.g., topology and branch lengths as opposed to just topology) does not necessarily translate into pronounced gains in accuracy, highlighting the strengths and limits of different methods for species-tree estimation. Differences in accuracy scores between methods for different sampling regimes also emphasize that it would be a mistake to assume that more computationally intensive species-tree estimation procedures will always provide better estimates of species trees. To the contrary, the performance of a method depends not only on the method per se but also on the compatibility between the input genetic data and the method, as determined by the relative impact of mutational and coalescent variance.
