4.8 Article

Complexity of the simplest species tree problem

Journal

MOLECULAR BIOLOGY AND EVOLUTION
Volume 38, Issue 9, Pages 3993-4009

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/molbev/msab009

Keywords

concatenation; efficiency; molecular clock; MSC; multispecies coalescent; species tree

Funding

  1. Biotechnology and Biological Sciences Research Council [BB/P006493/1]
  2. BBSRC equipment grant [BB/R01356X/1]
  3. Natural Science Foundation [32070685, 31671370]
  4. Youth Innovation Promotion Association of Chinese Academy of Sciences [201901]
  5. BBSRC [BB/R01356X/1, BB/P006493/1] Funding Source: UKRI


This study investigated the identifiability, consistency, and efficiency of different species tree methods in the case of three species and three sequences under the molecular clock using mathematical analysis and computer simulation. The results suggest that full-likelihood methods are considerably more efficient than summary methods and can provide estimates of important parameters.
The multispecies coalescent model provides a natural framework for species tree estimation accounting for gene-tree conflicts. Although a number of species tree methods under the multispecies coalescent have been suggested and evaluated using simulation, their statistical properties remain poorly understood. Here, we use mathematical analysis aided by computer simulation to examine the identifiability, consistency, and efficiency of different species tree methods in the case of three species and three sequences under the molecular clock. We consider four major species-tree methods: concatenation, two-step, independent-sites maximum likelihood, and maximum likelihood. We develop approximations that predict that the probit transform of the species tree estimation error decreases linearly with the square root of the number of loci. Even in this simplest case, major differences exist among the methods. Full-likelihood methods are considerably more efficient than summary methods such as concatenation and two-step. They also provide estimates of important parameters such as species divergence times and ancestral population sizes, whereas these parameters are not identifiable by summary methods. Our results highlight the need to improve the statistical efficiency of summary methods and the computational efficiency of full-likelihood methods of species tree estimation.
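The probit relationship mentioned in the abstract can be explored with a small numerical sketch. The code below is not the paper's analysis: it assumes a simplified two-step method in which every gene tree is known without error and the species tree estimate is the most frequent gene tree, with ties broken at random. It relies on the standard result that, for three species with one sequence each, each of the two mismatching gene trees has probability exp(-T)/3, where T is the internal branch length in coalescent units. The function names and the choice T = 1.0 are hypothetical, chosen only for illustration.

```python
from math import comb, exp, sqrt
from statistics import NormalDist


def majority_vote_error(num_loci: int, t_internal: float) -> float:
    """Exact error probability of a majority vote over known gene trees.

    Assumption (not from the paper): gene trees are observed without
    phylogenetic error, and ties are broken uniformly at random.
    """
    q = exp(-t_internal) / 3.0   # probability of each mismatching gene tree
    p = 1.0 - 2.0 * q            # probability of the matching gene tree
    error = 0.0
    # Enumerate all multinomial counts (n1, n2, n3) with n1 + n2 + n3 = num_loci,
    # where n1 counts loci whose gene tree matches the species tree.
    for n1 in range(num_loci + 1):
        for n2 in range(num_loci - n1 + 1):
            n3 = num_loci - n1 - n2
            prob = (comb(num_loci, n1) * comb(num_loci - n1, n2)
                    * p ** n1 * q ** (n2 + n3))
            top = max(n1, n2, n3)
            if n1 < top:
                error += prob    # a mismatching tree wins outright
            else:
                tied = (n1 == top) + (n2 == top) + (n3 == top)
                error += prob * (1.0 - 1.0 / tied)  # lose a random tie-break
    return error


def probit(x: float) -> float:
    """Inverse of the standard normal CDF."""
    return NormalDist().inv_cdf(x)


if __name__ == "__main__":
    T = 1.0  # hypothetical internal branch length in coalescent units
    for num_loci in (1, 4, 9, 16, 25, 36):
        err = majority_vote_error(num_loci, T)
        print(f"sqrt(L)={sqrt(num_loci):4.1f}  error={err:.6f}  "
              f"probit={probit(err):.4f}")
```

Plotting the printed probit values against sqrt(L) gives a rough check of how close to linear the relationship is in this idealized setting. Analyses of real sequence data also involve gene-tree estimation error, which this sketch ignores.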
