Article

Fast and accurate distance-based phylogenetic placement using divide and conquer

Journal

MOLECULAR ECOLOGY RESOURCES
Volume 22, Issue 3, Pages 1213-1227

Publisher

WILEY
DOI: 10.1111/1755-0998.13527

Keywords

distance-based methods; metagenomics; microbiome; phylogenetic placement

Funding

  1. National Science Foundation (NSF) [IIS-1565862, NSF-1815485, ACI-1053575]
  2. 2020 UCSD Center for Microbiome Innovation Grand Challenge Award
  3. Arizona State University

This study introduces APPLES-2, a distance-based phylogenetic placement method that is more accurate and scalable than existing distance-based methods and even some leading maximum-likelihood methods. On a data set of 10,575 microbial species, it places 97% of query genomes within three branches of the optimal position in the species tree using 50 marker genes.
Phylogenetic placement of query samples on an existing phylogeny is increasingly used in molecular ecology, including sample identification and microbiome environmental sampling. As the reference trees used in these analyses continue to grow, so does the need for methods that place sequences on ultra-large trees with high accuracy. Distance-based placement methods have recently emerged as a path to provide such scalability while allowing flexibility to analyse both assembled and unassembled environmental samples. In this study, we introduce a distance-based phylogenetic placement method, APPLES-2, that is more accurate and scalable than existing distance-based methods and even some of the leading maximum-likelihood methods. This scalability is due to a divide-and-conquer technique that limits distance calculation and phylogenetic placement to the parts of the tree most relevant to each query. The increased scalability and accuracy enable us to study the effectiveness of APPLES-2 for placing microbial genomes on a data set of 10,575 microbial species using subsets of 381 marker genes. APPLES-2 has very high accuracy in this setting, placing 97% of query genomes within three branches of the optimal position in the species tree using 50 marker genes. Our proof-of-concept results show that APPLES-2 can quickly place metagenomic scaffolds on ultra-large backbone trees with high accuracy as long as a scaffold includes tens of marker genes. These results pave the way for more scalable and widespread use of distance-based placement in various areas of molecular ecology.
