Article; Proceedings Paper

TreeShrink: fast and accurate detection of outlier long branches in collections of phylogenetic trees

Journal

BMC Genomics
Volume 19, Issue -, Pages -

Publisher

BioMed Central Ltd
DOI: 10.1186/s12864-018-4620-2

Keywords

Tree diameter; Rogue taxon removal; Gene tree discordance

Funding

  1. NSF [IIS-1565862, ACI-1053575]
  2. National Institutes of Health (NIH) [5P30AI027767-28]
  3. Directorate for Computer & Information Science & Engineering
  4. Division of Information & Intelligent Systems [1565862] Funding Source: National Science Foundation


Background: Sequence data used in reconstructing phylogenetic trees may include various sources of error. Typically, errors are detected at the sequence level, but when they are missed, the erroneous sequences often appear as unexpectedly long branches in the inferred phylogeny.

Results: We propose an automatic method to detect such errors. We build a phylogeny that includes all the data, then detect sequences that artificially inflate the tree diameter. We formulate an optimization problem, called the k-shrink problem, that seeks the k leaves whose removal maximally reduces the tree diameter. We present an algorithm that finds the exact solution to this problem in polynomial time. We then use several statistical tests to find outlier species that have an unexpectedly high impact on the tree diameter. These tests can use a single tree or a set of related gene trees and can also adjust to species-specific patterns of branch length. The resulting method is called TreeShrink. We test our method on six phylogenomic biological datasets and an HIV dataset and show that it successfully detects and removes long branches. TreeShrink removes sequences more conservatively than rogue taxon removal and, once the amount of filtering is controlled, often reduces gene tree discordance more than rogue taxon removal does.

Conclusions: TreeShrink is an effective method for detecting sequences that lead to unrealistically long branch lengths in phylogenetic trees. The tool is publicly available at https://github.com/uym2/TreeShrink.
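The k-shrink problem stated in the abstract can be illustrated with a small sketch. The code below is not the TreeShrink implementation and not the polynomial-time algorithm the abstract refers to: it is a brute-force search, exponential in k, that only makes the objective concrete. It assumes that the tree diameter means the largest path length between any two leaves. The edge-list format, the function names (`leaf_distances`, `diameter`, `k_shrink_bruteforce`) and the toy tree are all hypothetical and chosen for illustration.

```python
from itertools import combinations


def leaf_distances(edges):
    """Return the sorted leaf names and all leaf-to-leaf path lengths.

    `edges` is a list of (node, node, branch_length) tuples describing
    an unrooted tree (an assumed, illustrative input format).
    """
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    leaves = sorted(n for n, nbrs in adj.items() if len(nbrs) == 1)
    dist = {}
    for src in leaves:
        # Depth-first walk from each leaf, accumulating branch lengths.
        stack = [(src, None, 0.0)]
        while stack:
            node, parent, d = stack.pop()
            if node != src and len(adj[node]) == 1:
                dist[(src, node)] = d
            for nxt, w in adj[node]:
                if nxt != parent:
                    stack.append((nxt, node, d + w))
    return leaves, dist


def diameter(leaves, dist):
    """Largest pairwise distance among the given leaves."""
    return max((dist[(a, b)] for a, b in combinations(leaves, 2)), default=0.0)


def k_shrink_bruteforce(edges, k):
    """Try every set of k leaves and keep the one whose removal
    leaves the smallest diameter. Illustrative only: exponential in k."""
    leaves, dist = leaf_distances(edges)
    best = None
    for removed in combinations(leaves, k):
        kept = [x for x in leaves if x not in removed]
        d = diameter(kept, dist)
        if best is None or d < best[1]:
            best = (set(removed), d)
    return best


# Toy tree: leaf D sits on a suspiciously long branch.
toy_edges = [
    ("A", "x", 1), ("B", "x", 1), ("x", "y", 1),
    ("C", "y", 1), ("D", "y", 10),
]
print(k_shrink_bruteforce(toy_edges, 1))  # ({'D'}, 3.0)
```

In the toy tree, removing D alone drops the diameter from 12 to 3, which is the kind of disproportionate impact the statistical tests mentioned in the abstract are meant to flag. The sketch does not attempt those tests.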
