HLA haplotype frequency estimation for heterogeneous populations using a graph-based imputation algorithm

Journal

HUMAN IMMUNOLOGY
Volume 82, Issue 10, Pages 746-757

Publisher

ELSEVIER SCIENCE INC
DOI: 10.1016/j.humimm.2021.07.001

Keywords

Multi-region expectation-maximization algorithm; HLA; Haplotype frequencies

A novel multi-region EM implementation that integrates HLA information from different population groups has been developed, showing higher likelihood values and better haplotype recovery than other EM implementations when tested on real and simulated datasets. This approach addresses the need for improved HLA imputation and matching for multi-region populations.
HLA haplotype frequencies are estimated from ambiguous unphased HLA genotyping data using Expectation-Maximization (EM) algorithms. Current population genetics methods require independent EM frequency estimates for each population, and assume that each population is in Hardy-Weinberg Equilibrium (HWE). The HWE assumption of EM has thus far resulted in the exclusion of individuals from mixed or unknown ethnic backgrounds from reference datasets. Multi-region populations are currently poorly served by stem cell donor registry HLA imputation and matching implementations due to the inability of such algorithms to incorporate admixture into their population genetics models. To address this unmet need, we have expanded the imputation component of our GRaph IMputation and Matching (GRIMM) framework, where imputation becomes the expectation step in an iterative EM algorithm. Our novel multi-region EM implementation considers region as a Bayesian prior, enabling integration of HLA information from multiple single-region population groups, and for the first time including individuals with ambiguous or mixed ethnic backgrounds. We show that our multi-region EM produces much higher likelihood values and better haplotype recovery as measured by Kullback-Leibler divergence than all evaluated EM implementations when tested on real datasets of US donor registry HLA typings as well as simulated multi-region datasets of ambiguous HLA typings. © 2021 American Society for Histocompatibility and Immunogenetics. Published by Elsevier Inc. All rights reserved.
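
To make the EM idea in the abstract concrete, the following is a minimal sketch of the classical single-population EM for haplotype frequencies under HWE, together with a Kullback-Leibler divergence helper of the kind used to score haplotype recovery. It is not the GRIMM multi-region implementation described above: it has no region prior and no graph-based imputation. The function names, the input format (each individual given as a list of candidate unordered haplotype pairs consistent with the ambiguous typing), and the haplotype labels `hapA` to `hapD` are assumptions made for illustration only.

```python
from collections import defaultdict
from math import log


def em_haplotype_freqs(genotypes, n_iter=100):
    """Classical EM under HWE (illustrative; not the paper's multi-region EM).

    genotypes: one list per individual, holding the candidate unordered
    haplotype pairs (hap1, hap2) compatible with that individual's typing.
    Returns a dict mapping haplotype label to estimated frequency.
    """
    haplotypes = sorted({h for cands in genotypes for pair in cands for h in pair})
    freq = {h: 1.0 / len(haplotypes) for h in haplotypes}  # uniform start
    for _ in range(n_iter):
        expected = defaultdict(float)
        for cands in genotypes:
            # E-step: posterior weight of each candidate pair under HWE.
            # An unordered heterozygous pair has probability 2 * f_a * f_b.
            weights = [(1.0 if a == b else 2.0) * freq[a] * freq[b]
                       for a, b in cands]
            total = sum(weights)
            for (a, b), w in zip(cands, weights):
                expected[a] += w / total
                expected[b] += w / total
        # M-step: expected haplotype counts divided by total chromosomes.
        freq = {h: expected[h] / (2 * len(genotypes)) for h in haplotypes}
    return freq


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) over haplotype labels; q is floored at eps (an assumption)."""
    return sum(pv * log(pv / max(q.get(h, 0.0), eps))
               for h, pv in p.items() if pv > 0)


# Hypothetical toy data: the second individual is ambiguous between two phasings.
toy_genotypes = [
    [("hapA", "hapA")],
    [("hapA", "hapB"), ("hapC", "hapD")],
    [("hapA", "hapB")],
]
estimated = em_haplotype_freqs(toy_genotypes)
```

In this toy input the unambiguous individuals lend support to `hapA` and `hapB`, so the EM iterations shift the ambiguous individual toward that phasing. A multi-region variant along the lines the abstract describes would additionally weight each candidate by a prior over regions, which this sketch does not attempt.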
