Article

An average-case sublinear forward algorithm for the haploid Li and Stephens model

Journal

ALGORITHMS FOR MOLECULAR BIOLOGY
Volume 14

Publisher

BMC
DOI: 10.1186/s13015-019-0144-9

Keywords

Forward algorithm; Haplotype; Complexity; Sublinear algorithms

Funding

  1. National Human Genome Research Institute of the National Institutes of Health [5U54HG007990]
  2. National Heart, Lung, and Blood Institute of the National Institutes of Health [1U01HL137183-01]
  3. W.M. Keck Foundation
  4. Simons Foundation


Background: Hidden Markov models of haplotype inheritance such as the Li and Stephens model allow for computationally tractable probability calculations using the forward algorithm as long as the representative reference panel used in the model is sufficiently small. Specifically, the monoploid Li and Stephens model and its variants are linear in reference panel size unless heuristic approximations are used. However, sequencing projects numbering in the thousands to hundreds of thousands of individuals are underway, and others numbering in the millions are anticipated.

Results: To make the forward algorithm for the haploid Li and Stephens model computationally tractable for these datasets, we have created a numerically exact version of the algorithm with observed average case sublinear runtime with respect to reference panel size k when tested against the 1000 Genomes dataset.

Conclusions: We show a forward algorithm which avoids any tradeoff between runtime and model complexity. Our algorithm makes use of two general strategies which might be applicable to improving the time complexity of other future sequence analysis algorithms: sparse dynamic programming matrices and lazy evaluation.
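For orientation, the conventional forward recursion that the abstract describes as linear in reference panel size k can be sketched as follows. This is a minimal illustration of the standard baseline, not the sublinear algorithm of the paper. The function name `li_stephens_forward`, the biallelic 0/1 encoding, the single per-site mismatch probability `mu`, and the single per-site switch probability `rho` are all assumptions made for this example.

```python
from typing import Sequence


def li_stephens_forward(
    panel: Sequence[Sequence[int]],
    query: Sequence[int],
    mu: float = 0.01,
    rho: float = 0.05,
) -> float:
    """Probability of `query` under a simple haploid Li and Stephens HMM.

    Assumptions (hypothetical, for illustration only):
      - panel[j][i] is the 0/1 allele of reference haplotype j at site i
      - mu is the probability that the copied allele is observed incorrectly
      - rho is the probability of re-choosing the copied haplotype
        uniformly at random between adjacent sites
    """
    k = len(panel)
    n = len(query)
    if k == 0 or n == 0:
        raise ValueError("panel and query must be non-empty")

    def emit(j: int, i: int) -> float:
        # Emission probability: match with 1 - mu, mismatch with mu.
        return 1.0 - mu if panel[j][i] == query[i] else mu

    # Initialisation: uniform prior over the k reference haplotypes.
    f = [emit(j, 0) / k for j in range(k)]

    for i in range(1, n):
        # The sum over all states is shared by every state, so each site
        # costs O(k) rather than O(k^2).
        total = sum(f)
        f = [
            emit(j, i) * ((1.0 - rho) * f[j] + rho * total / k)
            for j in range(k)
        ]

    return sum(f)
```

Each site touches every one of the k haplotypes, so the total work is proportional to n times k. The sketch multiplies raw probabilities and would underflow on long sequences; a practical version would rescale per site or work in log space.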


