Article

Distance-Based Phylogenetic Placement with Statistical Support

BIOLOGY-BASEL (2022)

Journal

BIOLOGY-BASEL

Volume 11, Issue 8, Pages -

Publisher

MDPI

DOI: 10.3390/biology11081212

Keywords

phylogenetic placement; statistical support; distance-based phylogenetic inference; bootstrapping

Category

Biology

Funding

National Institutes of Health [1R35GM142725]

Summary

Phylogenetic identification of unknown sequences through tree placement is commonly used in ecological studies. This article addresses the issue of uncertainty in placements obtained from incomplete and noisy data. Nonparametric bootstrapping is found to be the most accurate method for measuring support, and an efficient linear algebraic formulation for bootstrapping is presented. The article also compares the accuracy of maximum likelihood support values and distance-based methods in different applications and datasets.

Abstract

Phylogenetic identification of unknown sequences by placing them on a tree is routinely attempted in modern ecological studies. Such placements are often obtained from incomplete and noisy data, making it essential to augment the results with some notion of uncertainty. While the standard likelihood-based methods designed for placement naturally provide such measures of uncertainty, the newer and more scalable distance-based methods lack this crucial feature. Here, we adopt several parametric and nonparametric sampling methods for measuring the support of phylogenetic placements that have been obtained with the use of distances. Comparing the alternative strategies, we conclude that nonparametric bootstrapping is more accurate than the alternatives. We go on to show how bootstrapping can be performed efficiently using a linear algebraic formulation that makes it up to 30 times faster and implement this optimized version as part of the distance-based placement software APPLES. By examining a wide range of applications, we show that the relative accuracy of maximum likelihood (ML) support values as compared to distance-based methods depends on the application and the dataset. ML is advantageous for fragmentary queries, while distance-based support values are more accurate for full-length and multi-gene datasets. With the quantification of uncertainty, our work fills a crucial gap that prevents the broader adoption of distance-based placement tools.
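
The idea of nonparametric bootstrapping expressed as a matrix operation can be illustrated with a minimal sketch. This is not the APPLES implementation or the formulation from the article: it assumes uncorrected p-distances, gap-free aligned sequences of equal length, and a simplified nearest-reference rule in place of true tree placement. The function name `bootstrap_nearest_support` and its parameters are hypothetical. The point it shows is that resampling sites with replacement is equivalent to drawing per-site multiplicities, so the distances for all replicates come out of a single matrix product.

```python
import numpy as np


def bootstrap_nearest_support(refs, query, n_boot=100, seed=0):
    """Fraction of bootstrap replicates in which each reference is closest
    to the query (illustrative stand-in for placement support)."""
    rng = np.random.default_rng(seed)
    ref_arr = np.array([list(s) for s in refs])   # shape (n_refs, n_sites)
    query_arr = np.array(list(query))             # shape (n_sites,)
    n_sites = query_arr.size

    # mismatch[i, j] = 1.0 if reference i differs from the query at site j.
    mismatch = (ref_arr != query_arr).astype(float)

    # Sampling n_sites columns with replacement is the same as drawing
    # per-site counts from a uniform multinomial; shape (n_boot, n_sites).
    counts = rng.multinomial(
        n_sites, np.full(n_sites, 1.0 / n_sites), size=n_boot
    )

    # One matrix product gives every replicate's p-distances at once;
    # dists[i, b] is the distance from reference i to the query in replicate b.
    dists = mismatch @ counts.T / n_sites

    # Assumption: the "placement" is simply the nearest reference.
    # argmin breaks ties in favor of the lowest index.
    winners = dists.argmin(axis=0)
    return np.bincount(winners, minlength=len(refs)) / n_boot


if __name__ == "__main__":
    references = ["ACGTACGTAC", "ACGTTCGTAC", "TTTTTTTTTT"]
    print(bootstrap_nearest_support(references, "ACGTACGTAC", n_boot=200))
```

The returned vector sums to 1 across references and can be read as a support value per candidate. A real distance-based placement would apply a model-based distance correction and optimize the query's position on the edges of a reference tree rather than picking the nearest leaf.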
