4.6 Article

Sampling bias and model choice in continuous phylogeography: Getting lost on a random walk

Journal

PLOS COMPUTATIONAL BIOLOGY
Volume 17, Issue 1, Pages -

Publisher

PUBLIC LIBRARY SCIENCE
DOI: 10.1371/journal.pcbi.1008561

Keywords

-

Funding

  1. Cambridge Mathematics Placements (CMP)
  2. Interne Fondsen KU Leuven/Internal Funds KU Leuven [C14/18/094]
  3. Research Foundation - Flanders ('Fonds voor Wetenschappelijk Onderzoek - Vlaanderen')
  4. Agence Nationale pour la Recherche through the grant GENOSPACE
  5. European Molecular Biology Laboratory

The authors explore the effects of different model assumptions on phylogeographic inference and find that sample collection biases can strongly affect the quality of reconstruction. They suggest several strategies to counter these effects, but note that each adds computational burden. They also investigate the differences between phylogeographic models and their suitability in different scenarios.
Author summary

Phylogeography studies past locations and migration using the current geographic locations of genetic sequences. For example, it can be used to reconstruct the geographical spread of an outbreak from pathogen genetic sequences collected at different times and locations. Here, we investigate the effects of different model assumptions on phylogeographic inference. In particular, we examine the effects of the strategy used to collect samples. We show that sample collection biases can have a strong impact on the quality of phylogeographic reconstruction: a geographically biased sampling scheme can be very detrimental for popular continuous phylogeography models. We consider different ways to counter these effects, from using alternative phylogeographic models to including partially informative samples (known cases without genetic sequences). While these strategies do alleviate the effects of sampling biases, they also add considerable computational burden. We also investigate the intrinsic differences between phylogeographic models and their effects on reconstructed patterns in different scenarios.

Abstract

Phylogeographic inference allows reconstruction of the past geographical spread of pathogens or living organisms by integrating genetic and geographic data. A popular model in continuous phylogeography (with location data provided as latitude and longitude coordinates) describes spread as a Brownian motion (Brownian Motion Phylogeography, BMP) in continuous space and time, akin to similar models of continuous trait evolution. Here, we show that reconstructions using this model can be strongly affected by sampling biases, such as the lack of sampling from certain areas. As an attempt to reduce the effects of sampling bias on BMP, we consider the addition of sequence-free samples from under-sampled areas. While this approach alleviates the effects of sampling bias, in most scenarios it will not be a viable option because it requires prior knowledge of an outbreak's spatial distribution. We therefore consider an alternative model, the spatial Λ-Fleming-Viot process (ΛFV), which has recently gained popularity in population genetics. Despite the ΛFV's robustness to sampling biases, we find that the different assumptions of the ΛFV and BMP models result in different applicabilities, with the ΛFV being more appropriate for scenarios of endemic spread, and BMP being more appropriate for recent outbreaks or colonizations.
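
The effect of geographically biased sampling on a Brownian-motion reconstruction can be illustrated with a small toy simulation. This is a minimal sketch and not the authors' method: it assumes a star tree (every sampled lineage diverges independently from the root over the same time span), for which the maximum-likelihood root location under Brownian motion is simply the mean of the tip coordinates. The function names `simulate_star_tips` and `estimate_root`, the diffusion parameter `sigma`, and the "east-only" sampling rule are all hypothetical choices made for illustration.

```python
import random


def simulate_star_tips(n_tips, root=(0.0, 0.0), sigma=1.0, t=1.0, seed=0):
    """Simulate tip locations under Brownian motion on a star tree.

    Each tip is an independent 2D Gaussian displacement from the root
    with standard deviation sigma * sqrt(t) in each coordinate.
    """
    rng = random.Random(seed)
    sd = sigma * t ** 0.5
    return [
        (root[0] + rng.gauss(0.0, sd), root[1] + rng.gauss(0.0, sd))
        for _ in range(n_tips)
    ]


def estimate_root(tips):
    """Root estimate for a star tree with equal branch lengths.

    Under Brownian motion this maximum-likelihood estimate is the
    arithmetic mean of the tip coordinates.
    """
    n = len(tips)
    return (sum(x for x, _ in tips) / n, sum(y for _, y in tips) / n)


# True root is at (0, 0).
tips = simulate_star_tips(2000)

# Unbiased sampling: every simulated case is observed.
unbiased = estimate_root(tips)

# Biased sampling (assumed rule): only cases east of the root are observed.
east_only = [p for p in tips if p[0] > 0.0]
biased = estimate_root(east_only)

print("root estimate, all samples:      ", unbiased)
print("root estimate, east-only samples:", biased)
```

With all samples, the estimate stays close to the true root; with east-only sampling, the estimated root is pulled eastward into the sampled region, because the model has no information that unsampled areas were also occupied. Real analyses use a full phylogeny rather than a star tree, so this only conveys the direction of the bias, not its magnitude.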
