Article

Online Bayesian Phylodynamic Inference in BEAST with Application to Epidemic Reconstruction

Journal

MOLECULAR BIOLOGY AND EVOLUTION
Volume 37, Issue 6, Pages 1832-1842

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/molbev/msaa047

Keywords

BEAST; Markov chain Monte Carlo; real-time analysis; Bayesian phylogenetics; pathogen phylodynamics; online inference

Funding

  1. European Research Council under the European Union [725422-ReservoirDOCS]
  2. Wellcome Trust through the ARTIC Network [206298/Z/17/Z]
  3. Special Research Fund, KU Leuven (Bijzonder Onderzoeksfonds, KU Leuven) [OT/14/115]
  4. Research Foundation-Flanders (Fonds voor Wetenschappelijk Onderzoek-Vlaanderen) [G066215N, G0D5117N, G0B9317N, G0E1420N]
  5. NSF [DMS 1264153]
  6. NIH [R01 AI107034, U19 AI135995]
  7. Bill & Melinda Gates Foundation [OPP1175094 PANGEA-2]
  8. Interne Fondsen KU Leuven/Internal Funds KU Leuven [C14/18/094]
  9. Flemish Government-department EWI
  10. Research Foundation-Flanders (FWO)

Abstract

Reconstructing pathogen dynamics from genetic data as they become available during an outbreak or epidemic represents an important statistical scenario in which observations arrive sequentially in time and one is interested in performing inference in an online fashion. Widely used Bayesian phylogenetic inference packages are not set up for this purpose, generally requiring one to recompute trees and evolutionary model parameters de novo when new data arrive. To accommodate increasing data flow in a Bayesian phylogenetic framework, we introduce a methodology to efficiently update the posterior distribution with newly available genetic data. Our procedure is implemented in the BEAST 1.10 software package, and relies on a distance-based measure to insert new taxa into the current estimate of the phylogeny and imputes plausible values for new model parameters to accommodate growing dimensionality. This augmentation creates informed starting values and re-uses optimally tuned transition kernels for posterior exploration of growing data sets, reducing the time necessary to converge to target posterior distributions. We apply our framework to data from the recent West African Ebola virus epidemic and demonstrate a considerable reduction in time required to obtain posterior estimates at different time points of the outbreak. Beyond epidemic monitoring, this framework easily finds other applications within the phylogenetics community, where changes in the data (in terms of alignment changes, sequence addition or removal) present common scenarios that can benefit from online inference.
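The distance-based insertion step mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the BEAST 1.10 implementation: the `Node` class, the `hamming` distance, the `insert_taxon` helper, and the rule of placing the new internal node halfway along the branch above the closest tip are all assumptions made here for illustration. The abstract states only that a distance-based measure is used to place new taxa in the current tree estimate.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(eq=False)  # compare nodes by identity, not by field values
class Node:
    name: Optional[str] = None   # taxon label for tips, None for internal nodes
    height: float = 0.0          # time before the most recent sample
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    parent: Optional["Node"] = None


def make_internal(height: float, left: Node, right: Node) -> Node:
    """Create an internal node and wire up the parent pointers of its children."""
    node = Node(height=height, left=left, right=right)
    left.parent = node
    right.parent = node
    return node


def hamming(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def tips(node: Node) -> List[Node]:
    """Collect all tips below (and including) the given node."""
    if node.left is None and node.right is None:
        return [node]
    return tips(node.left) + tips(node.right)


def insert_taxon(root: Node, seqs: Dict[str, str], new_name: str,
                 new_seq: str, new_height: float = 0.0) -> Node:
    """Attach a new tip as the sibling of its genetically closest existing tip.

    Hypothetical placement rule: the new internal node sits halfway along the
    branch above the closest tip. Assumes the tree has at least two tips.
    Returns the newly created internal node.
    """
    closest = min(tips(root), key=lambda t: hamming(seqs[t.name], new_seq))
    old_parent = closest.parent
    if old_parent is None:
        raise ValueError("tree must contain at least two tips")
    lower = max(closest.height, new_height)
    if lower >= old_parent.height:
        raise ValueError("new tip is too old to attach on this branch")

    new_tip = Node(name=new_name, height=new_height)
    join = make_internal((lower + old_parent.height) / 2, closest, new_tip)
    join.parent = old_parent
    if old_parent.left is closest:
        old_parent.left = join
    else:
        old_parent.right = join
    seqs[new_name] = new_seq
    return join


# Usage: start from the tree ((A,B),C) and add a new sequence D.
a, b, c = Node("A"), Node("B"), Node("C")
root = make_internal(2.0, make_internal(1.0, a, b), c)
seqs = {"A": "AAAAAA", "B": "AAATTT", "C": "TTTTTT"}
join = insert_taxon(root, seqs, "D", "AAAAAT")
print(join.left.name, join.right.name, join.height)
```

In an online analysis the augmented tree would serve only as a starting value for further MCMC sampling, so a rough placement of this kind is meant to be refined by the sampler rather than trusted as an estimate. Imputing values for any new model parameters, which the abstract also mentions, is not covered by this sketch.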
