4.6 Article

Convergence assessment for Bayesian phylogenetic analysis using MCMC simulation

Journal

METHODS IN ECOLOGY AND EVOLUTION
Volume 13, Issue 1, Pages 77-90

Publisher

WILEY
DOI: 10.1111/2041-210X.13727

Keywords

Bayesian inference; convergence assessment; Markov chain Monte Carlo; phylogeny

Funding

  1. Deutsche Forschungsgemeinschaft [HO 6201/1-1]

This study explores methods for assessing convergence in Bayesian phylogenetics, including deriving a threshold for the minimum effective sample size (ESS) and converting tree samples into traces of the absence/presence of splits so that standard ESS computation applies. The Kolmogorov-Smirnov test is suggested for assessing convergence in distribution between replicated MCMC runs, while the potential scale reduction factor is found to be biased for skewed posterior distributions. The study also shows how to compute the distribution of differences in split frequencies and recommends checking convergence in split frequencies with the expected difference based on the 95% quantile.
Posterior distributions are commonly approximated by samples produced from a Markov chain Monte Carlo (MCMC) simulation. Every MCMC simulation has to be checked for convergence, that is, that sufficiently many samples have been obtained and that these samples indeed represent the true posterior distribution. Here we develop and test different approaches for convergence assessment in phylogenetics. We analytically derive a threshold for a minimum effective sample size (ESS) of 625. We observe that only the initial sequence estimator provides robust ESS estimates for common types of MCMC simulations (autocorrelated samples, adaptive MCMC, Metropolis-coupled MCMC). We show that standard ESS computation can be applied to phylogenetic trees if the tree samples are converted into traces of absence/presence of splits. Convergence in distribution between replicated MCMC runs can be assessed with the Kolmogorov-Smirnov test. The commonly used potential scale reduction factor (PSRF) is biased when applied to skewed posterior distributions. Additionally, we show how the distribution of differences in split frequencies can be computed exactly, akin to standard exact tests, and that it depends on the true frequency of a split. Hence, the average standard deviation of split frequencies is too simplistic, and the expected difference based on the 95% quantile should be used instead to check for convergence in split frequencies. We implemented the methods described here in the open-source R package Convenience, which allows users to easily test for convergence using output from standard phylogenetic inference software.
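
The ideas in the abstract can be illustrated with a minimal Python sketch. It is not the Convenience package and does not reproduce its API; all function names here are hypothetical. It assumes that each sampled tree is represented as a set of splits (each split a frozenset of taxon names), that a Geyer-style initial positive sequence estimator is an acceptable stand-in for the initial sequence estimator named above, and that the two runs being compared each contribute n independent draws, so that the count of samples containing a split is Binomial(n, p).

```python
import math


def split_trace(trees, split):
    """Return a 0/1 trace: 1 where `split` is present in the sampled tree."""
    return [1 if split in tree else 0 for tree in trees]


def autocov(x, mean, lag):
    """Biased (divide-by-n) sample autocovariance at the given lag."""
    n = len(x)
    return sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag)) / n


def ess_initial_sequence(trace):
    """ESS from an initial positive sequence estimator (Geyer-style sketch)."""
    n = len(trace)
    mean = sum(trace) / n
    gamma0 = autocov(trace, mean, 0)
    if gamma0 == 0.0:
        return math.nan  # constant trace: ESS is undefined
    pair_sum = 0.0
    m = 0
    # Sum pairs of adjacent autocovariances until a pair is non-positive.
    while 2 * m + 1 < n:
        pair = autocov(trace, mean, 2 * m) + autocov(trace, mean, 2 * m + 1)
        if pair <= 0.0:
            break
        pair_sum += pair
        m += 1
    asym_var = -gamma0 + 2.0 * pair_sum
    if asym_var <= 0.0:
        return float(n)  # convention chosen for this sketch only
    return n * gamma0 / asym_var


def binom_pmf(n, p):
    """Binomial(n, p) probabilities, computed in log space; needs 0 < p < 1."""
    log_p, log_q = math.log(p), math.log1p(-p)
    return [
        math.exp(
            math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * log_p + (n - k) * log_q
        )
        for k in range(n + 1)
    ]


def split_freq_diff_quantile(p, n, q=0.95):
    """Smallest d/n such that P(|f1 - f2| <= d/n) >= q for two runs.

    Assumes each run yields n independent samples of a split whose true
    frequency is p (an assumption of this sketch).
    """
    pmf = binom_pmf(n, p)
    diff_prob = [0.0] * (n + 1)
    for i, prob_i in enumerate(pmf):
        for j, prob_j in enumerate(pmf):
            diff_prob[abs(i - j)] += prob_i * prob_j
    cumulative = 0.0
    for d, prob in enumerate(diff_prob):
        cumulative += prob
        if cumulative >= q:
            return d / n
    return 1.0
```

For example, `ess_initial_sequence(split_trace(trees, some_split))` gives a per-split ESS that could be compared against the threshold of 625, and `split_freq_diff_quantile(p, 625)` evaluated for several values of p shows how the expected difference in split frequencies changes with the true frequency of the split.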
