Article

Multilocus phylogenetic analysis with gene tree clustering

Journal

ANNALS OF OPERATIONS RESEARCH
Volume 276, Issue 1-2, Pages 293-313

Publisher

SPRINGER
DOI: 10.1007/s10479-017-2456-9

Keywords

Phylogenetics; Normalized cut; Clustering

Funding

  1. JSPS KAKENHI [26540016]
  2. ND EPSCoR NSF [1355466]
  3. Office of Integrative Activities
  4. Office Of The Director [1355466] Funding Source: National Science Foundation
  5. Grants-in-Aid for Scientific Research [26540016, 26280009] Funding Source: KAKEN


Both theoretical and empirical evidence indicate that phylogenetic trees of different genes (loci) do not display precisely matching topologies. Nonetheless, most genes display related phylogenies, which implies that they form cohesive subsets (clusters). In this work, we discuss gene tree clustering, focusing on the normalized cut (Ncut) framework as a suitable method for phylogenetics. We show that this framework is both efficient and statistically accurate when clustering gene trees using the geodesic distance between them over the Billera-Holmes-Vogtmann tree space. We also conduct a computational study on the performance of different clustering methods, with and without preprocessing, under different distance metrics, and using a series of dimensionality reduction techniques. Our results with simulated data reveal that Ncut accurately clusters the set of gene trees, given a species tree under the coalescent process. Other observations from our computational study include the similar performance of Ncut and k-means under most dimensionality reduction schemes, the worse performance of hierarchical clustering, and the significantly better performance of the neighbor-joining method with the p-distance compared to maximum-likelihood estimation. Supplementary material, all code, and the data used in this work are freely available online at http://polytopes.net/research/cluster/.
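
To make the normalized cut idea in the abstract concrete, here is a minimal sketch of a two-way Ncut on a precomputed pairwise distance matrix. It is not the authors' code. It assumes the standard spectral relaxation (second eigenvector of the symmetric normalized Laplacian), a Gaussian affinity with a median-distance bandwidth, and a sign threshold, none of which are stated in the abstract. The function name `ncut_bipartition` and the toy points are hypothetical. In the paper's setting the matrix would hold geodesic distances between gene trees in Billera-Holmes-Vogtmann tree space, which this sketch does not compute.

```python
import numpy as np


def ncut_bipartition(dist, sigma=None):
    """Split items into two clusters from a symmetric distance matrix.

    Illustrative spectral relaxation of the normalized cut; the affinity
    kernel and bandwidth choice are assumptions, not taken from the paper.
    """
    dist = np.asarray(dist, dtype=float)
    if sigma is None:
        # Assumed heuristic: bandwidth = median off-diagonal distance.
        sigma = np.median(dist[np.triu_indices_from(dist, k=1)])

    # Gaussian affinity: small distances give weights close to 1.
    W = np.exp(-dist**2 / (2.0 * sigma**2))
    np.fill_diagonal(W, 0.0)

    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}.
    degree = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    L_sym = np.eye(len(degree)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]

    # eigh returns eigenvalues in ascending order; column 1 is the
    # eigenvector for the second-smallest eigenvalue.
    _, eigvecs = np.linalg.eigh(L_sym)
    y = d_inv_sqrt * eigvecs[:, 1]

    # Threshold the relaxed indicator at zero to get a hard partition.
    return (y > 0).astype(int)


# Toy stand-in for a tree-distance matrix: Euclidean distances between
# two well-separated groups of points.
points = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
toy_dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
labels = ncut_bipartition(toy_dist)
print(labels)
```

The sign of an eigenvector is arbitrary, so which group receives label 0 and which receives label 1 can differ between runs or platforms; only the grouping itself is meaningful. More than two clusters would need either recursive bipartitioning or a k-means step on several eigenvectors.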
