Article

Correcting Bias in Allele Frequency Estimates Due to an Observation Threshold: A Markov Chain Analysis

Journal

GENOME BIOLOGY AND EVOLUTION
Volume 14, Issue 4

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/gbe/evac047

Keywords

conditioned observations; missing values; random genetic drift; Wright-Fisher model; population genetics theory; stochastic population dynamics

Funding

  1. Open Access Publication Fund of Bielefeld University

The article examines the conditioning effect produced by observation thresholds in biology and its impact on gene-frequency trajectories, where it can lead to severe biases in allele frequency estimates under purifying selection. The study presents a measure of the strength of the conditioning and a method for correcting its effects.
Many problems in biology and related disciplines involve stochasticity, where a signal can be detected only when it lies above a threshold level, while signals below the threshold are simply not detected. A consequence is that the detected signal is conditioned to lie above the threshold and is not representative of the actual signal. In this work, we present some general results for the conditioning that occurs due to the existence of such an observational threshold. We show that this conditioning is relevant, for example, to gene-frequency trajectories, where many loci in the genome are simultaneously measured in a given generation. Such a threshold can lead to severe biases in allele frequency estimates under purifying selection. In the analysis presented, within the context of Markov chains such as the Wright-Fisher model, we address two key questions: (1) What is a natural measure of the strength of the conditioning associated with an observation threshold? (2) What is a principled way to correct for the effects of the conditioning? We answer the first question in terms of a proportion. Starting with a large number of trajectories, the relevant quantity is the proportion of these trajectories that are above the threshold at a later time and hence are detected. The smaller the value of this proportion, the stronger the effects of conditioning. We provide an approximate analytical answer to the second question, which corrects the bias produced by an observation threshold and performs to reasonable accuracy in the Wright-Fisher model for biologically plausible parameter values.
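
The proportion described in the abstract can be illustrated with a small simulation. The following is a minimal sketch, assuming a haploid Wright-Fisher model with genic selection; the function names, the selection scheme, and all parameter values are illustrative assumptions and are not taken from the article. It estimates the fraction of trajectories at or above a threshold at a later time and compares the mean frequency of the detected trajectories with the mean over all trajectories. It does not implement the article's analytical correction.

```python
import numpy as np


def simulate_wright_fisher(n_traj, pop_size, x0, s, generations, rng):
    """Simulate independent haploid Wright-Fisher trajectories.

    Assumed model (not from the article): genic selection with
    coefficient s (s < 0 is purifying), then binomial resampling.
    Returns the allele frequency of each trajectory after the
    given number of generations.
    """
    counts = np.full(n_traj, round(x0 * pop_size), dtype=np.int64)
    for _ in range(generations):
        x = counts / pop_size
        # Expected frequency after selection, before drift.
        p = x * (1.0 + s) / (1.0 + s * x)
        # Random genetic drift: sample the next generation.
        counts = rng.binomial(pop_size, p)
    return counts / pop_size


def threshold_summary(freqs, threshold):
    """Return (detected proportion, mean of all, mean of detected)."""
    detected = freqs >= threshold
    proportion = detected.mean()
    mean_detected = freqs[detected].mean() if detected.any() else float("nan")
    return proportion, freqs.mean(), mean_detected


if __name__ == "__main__":
    rng = np.random.default_rng(0)  # fixed seed for reproducibility
    freqs = simulate_wright_fisher(
        n_traj=10_000, pop_size=1_000, x0=0.1, s=-0.02,
        generations=50, rng=rng,
    )
    prop, mean_all, mean_det = threshold_summary(freqs, threshold=0.05)
    print(f"proportion detected:        {prop:.3f}")
    print(f"mean frequency (all):       {mean_all:.4f}")
    print(f"mean frequency (detected):  {mean_det:.4f}")
```

In this sketch, a smaller detected proportion corresponds to stronger conditioning, and the gap between the two means gives a rough sense of the bias that an observation threshold introduces into a naive frequency estimate.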
