Article

Consistency of SVDQuartets and Maximum Likelihood for Coalescent-Based Species Tree Estimation

Journal

SYSTEMATIC BIOLOGY
Volume 70, Issue 1, Pages 33-48

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/sysbio/syaa039

Keywords

Consistency; gene tree; maximum likelihood; multilocus data; phylogenetic inference; species tree; SVDQuartets

The study shows that, for four taxa, SVDQuartets is statistically consistent for both CIS data and multilocus data, and that ML is consistent for CIS data under the JC69 model. A proof of ML consistency for the more general multilocus case remains difficult.
Numerous methods for inferring species-level phylogenies under the coalescent model have been proposed within the last 20 years, and debates continue about the relative strengths and weaknesses of these methods. One desirable property of a phylogenetic estimator is that of statistical consistency, which means intuitively that as more data are collected, the probability that the estimated tree has the same topology as the true tree goes to 1. To date, consistency results for species tree inference under the multispecies coalescent (MSC) have been derived only for summary statistics methods, such as ASTRAL and MP-EST. These methods have been found to be consistent given true gene trees but may be inconsistent when gene trees are estimated from data for loci of finite length. Here, we consider the question of statistical consistency for four taxa for SVDQuartets for general data types, as well as for the maximum likelihood (ML) method in the case in which the data are a collection of sites generated under the MSC model such that the sites are conditionally independent given the species tree (we call these data coalescent independent sites [CIS] data). We show that SVDQuartets is statistically consistent for all data types (i.e., for both CIS data and for multilocus data), and we derive its rate of convergence. We additionally show that ML is consistent for CIS data under the JC69 model and discuss why a proof for the more general multilocus case is difficult. Finally, we compare the performance of ML and SVDQuartets using simulation for both data types.
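
The intuitive definition of statistical consistency given in the abstract can be written as a limit statement. The notation below is an assumption introduced here for illustration and is not taken from the paper: T^* denotes the true four-taxon species tree topology, and \hat{T}_n denotes the topology estimated from n units of data (sites for CIS data, loci for multilocus data).

```latex
% Statistical consistency of a species tree estimator (illustrative notation):
% the probability of recovering the true topology tends to 1 as the data grow.
\[
  \lim_{n \to \infty} \Pr\bigl( \hat{T}_n = T^{*} \bigr) = 1 .
\]
```

Under this reading, a rate of convergence describes how quickly the error probability \Pr(\hat{T}_n \neq T^{*}) shrinks as n increases.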
