4.6 Article

Hierarchical phylogenetic models for analyzing multipartite sequence data

Journal

SYSTEMATIC BIOLOGY
Volume 52, Issue 5, Pages 649-664

Publisher

OXFORD UNIV PRESS
DOI: 10.1080/10635150390238879

Keywords

Bayes factor; Cavia; CXCR4/CCR5 coreceptor; HIV evolution; horizontal gene transfer; MCMC; phylogeny

Funding

  1. NCI NIH HHS [CA16042] Funding Source: Medline
  2. NIAID NIH HHS [AI28697] Funding Source: Medline

Debate exists over how to incorporate information from multipartite sequence data in phylogenetic analyses. Strict combined-data approaches argue for concatenation of all partitions and estimation of one evolutionary history, maximizing the explanatory power of the data. Consensus/independence approaches endorse a two-step procedure in which partitions are analyzed independently and a consensus is then determined from the multiple results. Mixtures across the model space of a strict combined-data approach and a priori independent parameters are popular ways to integrate these two approaches. We propose an alternative middle ground by constructing a Bayesian hierarchical phylogenetic model. Our hierarchical framework enables researchers to pool information across data partitions to improve the precision of estimates in individual partitions while permitting estimation and testing of tendencies in across-partition quantities. Such across-partition quantities include the distribution from which individual topologies relating the sequences within a partition are drawn. We propose standard hierarchical priors on continuous evolutionary parameters across partitions, while the structure on topologies varies depending on the research problem. We illustrate our model with three examples. First, we explore the evolutionary history of the guinea pig (Cavia porcellus) using alignments of 13 mitochondrial genes. The hierarchical model returns substantially more precise continuous parameter estimates than an independent parameter approach without losing the salient features of the data. Second, we analyze the frequency of horizontal gene transfer using 50 prokaryotic genes. We assume an unknown species-level topology and allow individual gene topologies to differ from this with a small estimable probability. Simultaneously inferring the species and individual gene topologies returns a transfer frequency of 17%. Third, we examine HIV sequences longitudinally sampled from HIV+ patients. We ask whether posttreatment development of CCR5 coreceptor virus represents concerted evolution from middisease CXCR4 virus or reemergence of initial infecting CCR5 virus. The hierarchical model pools partitions from multiple unrelated patients by assuming that the topology for each patient is drawn from a multinomial distribution with unknown probabilities. Preliminary results suggest evolution and not reemergence.
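
The multinomial structure on patient topologies can be illustrated with a minimal sketch. It is not the authors' implementation: in the paper the topologies are latent and inferred jointly by MCMC, whereas this sketch assumes, purely for illustration, that each patient's topology is already known. The function name `posterior_topology_probs`, the symmetric Dirichlet prior, the topology labels, and the tallies are all hypothetical.

```python
def posterior_topology_probs(counts, prior_alpha=1.0):
    """Posterior mean of multinomial topology probabilities.

    Assumes a symmetric Dirichlet(prior_alpha) prior, which is conjugate to
    the multinomial: the posterior mean for each category is
    (count + prior_alpha) / (total count + prior_alpha * number of categories).
    """
    total = sum(counts.values()) + prior_alpha * len(counts)
    return {topo: (n + prior_alpha) / total for topo, n in counts.items()}


# Hypothetical tallies of per-patient topologies (made up, not data from the paper).
counts = {"evolution": 4, "reemergence": 1, "other": 0}
probs = posterior_topology_probs(counts)

for topo, p in probs.items():
    print(f"{topo}: {p:.3f}")
```

Pooling across patients in this way lets a topology that recurs in several patients raise the estimated probability of that topology for every patient, which is the sense in which the hierarchical model borrows strength across partitions.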
