4.6 Article

Automated analysis of phylogenetic clusters

Journal

BMC BIOINFORMATICS
Volume 14, Issue -, Pages -

Publisher

BMC
DOI: 10.1186/1471-2105-14-317

Keywords

Phylogenetics; Cluster; Sequence analysis; Virus; HIV; Epidemiology

Funding

  1. Wellcome Trust [092807]
  2. Biotechnology and Biological Sciences Research Council [BB/F017030/1]
  3. Medical Research Council [G0900274]
  4. Department of Health
  5. Boehringer Ingelheim
  6. Bristol-Myers Squibb
  7. Gilead
  8. Tibotec (a division of Janssen-Cilag)
  9. Roche
  10. Biotechnology and Biological Sciences Research Council [1098048, 1041234] Funding Source: researchfish
  11. Medical Research Council [G0900274, G0600587] Funding Source: researchfish
  12. MRC [G0600587, G0900274] Funding Source: UKRI

Background: As sequence data sets used for the investigation of pathogen transmission patterns increase in size, automated tools and standardized methods for cluster analysis have become necessary. We have developed an automated Cluster Picker which identifies monophyletic clades meeting user-input criteria for bootstrap support and maximum genetic distance within large phylogenetic trees. A second tool, the Cluster Matcher, automates the process of linking genetic data to epidemiological or clinical data, and matches clusters between runs of the Cluster Picker.

Results: We explore the effect of different bootstrap and genetic distance thresholds on clusters identified in a data set of publicly available HIV sequences, and compare these results to those of a previously published tool for cluster identification. To demonstrate their utility, we then use the Cluster Picker and Cluster Matcher together to investigate how clusters in the data set changed over time. We find that clusters containing sequences from more than one UK location at the first time point (multiple origin) were significantly more likely to grow than those representing only a single location.

Conclusions: The Cluster Picker and Cluster Matcher can rapidly process phylogenetic trees containing tens of thousands of sequences. Together these tools will facilitate comparisons of pathogen transmission dynamics between studies and countries.
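
The selection rule described for the Cluster Picker (a monophyletic clade that meets both a bootstrap support threshold and a maximum genetic distance threshold) can be illustrated with a minimal sketch. This is not the authors' implementation. The `Node` class, the `p_distance` and `pick_clusters` functions, the root-to-tip scan that keeps the largest qualifying clade, the use of the simple proportion of differing sites as the distance, and the toy tree and sequences are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from itertools import combinations


@dataclass
class Node:
    """Hypothetical tree node: leaves carry a name, internal nodes a support value."""
    name: str = ""
    support: float = 0.0  # bootstrap support as a fraction (assumed scale 0..1)
    children: list = field(default_factory=list)

    def leaves(self):
        """Names of all tips below this node."""
        if not self.children:
            return [self.name]
        return [leaf for child in self.children for leaf in child.leaves()]


def p_distance(a, b):
    """Proportion of differing sites between two aligned, equal-length sequences."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def pick_clusters(node, seqs, min_support, max_dist):
    """Scan from the root down and keep the largest clades meeting both thresholds."""
    if not node.children:
        return []  # a single tip is not a cluster
    names = node.leaves()
    # Largest pairwise distance among the tips of this clade.
    widest = max(
        (p_distance(seqs[a], seqs[b]) for a, b in combinations(names, 2)),
        default=0.0,
    )
    if len(names) >= 2 and node.support >= min_support and widest <= max_dist:
        return [sorted(names)]  # accept the clade and do not look inside it
    clusters = []
    for child in node.children:
        clusters.extend(pick_clusters(child, seqs, min_support, max_dist))
    return clusters


# Toy alignment and tree ((A,B)0.95,(C,D)0.60)1.00, invented for this example.
seqs = {
    "A": "ACGTACGTAC",
    "B": "ACGTACGTAT",
    "C": "ACGTTCGGAC",
    "D": "ACGTTCGGAT",
}
tree = Node(support=1.0, children=[
    Node(support=0.95, children=[Node(name="A"), Node(name="B")]),
    Node(support=0.60, children=[Node(name="C"), Node(name="D")]),
])

print(pick_clusters(tree, seqs, min_support=0.9, max_dist=0.15))
```

With these toy inputs the root clade fails the distance threshold and the (C, D) clade fails the support threshold, so only the (A, B) clade is reported. Lowering `min_support` or raising `max_dist` changes which clades qualify, which is the kind of threshold sensitivity the Results paragraph explores.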
