Article

The Practical Haplotype Graph, a platform for storing and using pangenomes for imputation

Journal

BIOINFORMATICS
Volume 38, Issue 15, Pages 3698-3702

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/btac410

Funding

  1. U.S. Department of Agriculture-Agricultural Research Service, National Science Foundation Research-PGR [IOS-1238014, IOS-1822330]
  2. Bill and Melinda Gates Foundation [OPP1159867, OPP1175661]

Pangenomes provide novel insights into population and quantitative genetics, genomics, and breeding. However, managing and using pangenomes for genomically diverse species is challenging. We developed a trellis graph representation anchored to the reference genome that accurately represents most pangenomes and can be used to impute complete genomes from low-density sequence or variant data.
Motivation: Pangenomes provide novel insights for population and quantitative genetics, genomics and breeding that are not available from studying a single reference genome. Instead, a species is better represented by a pangenome, or collection of genomes. Unfortunately, managing and using pangenomes for genomically diverse species is computationally and practically challenging. We developed a trellis graph representation anchored to the reference genome that represents most pangenomes well and can be used to impute complete genomes from low-density sequence or variant data.
Results: The Practical Haplotype Graph (PHG) is a pangenome pipeline, database (PostgreSQL and SQLite), data model (Java, Kotlin or R) and Breeding API (BrAPI) web service. The PHG has already accurately represented diversity in four major crops including maize, one of the most genomically diverse species, with up to 1000-fold data compression. Using simulated data, we show that, even at 0.1x coverage, with appropriate reads and sequence alignment, imputation results in extremely accurate haplotype reconstruction. The PHG is a platform and environment for the understanding and application of genomic diversity.
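To illustrate how a trellis anchored to reference ranges could support imputation from sparse reads, here is a minimal Python sketch. The abstract does not describe the PHG's actual algorithm, so everything below is an assumption made for illustration: the function name `impute_path`, the parameters `p_switch` and `read_error`, the per-range read-count input, and the choice of a generic Viterbi search over a simple hidden Markov model. Each column of the trellis is one reference range, each node is a haplotype from a known genome, and the most likely path through the trellis is taken as the imputed genome.

```python
import math


def impute_path(read_counts, p_switch=0.01, read_error=0.01):
    """Return the most likely haplotype per reference range (Viterbi).

    read_counts: one dict per reference range, mapping a haplotype id to
    the number of reads that support it in that range. Ranges with no
    reads are empty dicts, which is common at low coverage.
    All names and parameters here are hypothetical, not the PHG API.
    """
    haplotypes = sorted({h for col in read_counts for h in col})
    n = len(haplotypes)
    if n == 0:
        return []

    # Transition scores: staying on the same genome is favoured.
    stay = math.log(1.0 - p_switch) if n > 1 else 0.0
    switch = math.log(p_switch / (n - 1)) if n > 1 else float("-inf")
    log_hit = math.log(1.0 - read_error)
    log_miss = math.log(read_error)

    def emission(col, h):
        # Each read either supports haplotype h or counts against it.
        return sum(c * (log_hit if g == h else log_miss) for g, c in col.items())

    def step(g, h):
        return stay if g == h else switch

    score = {h: emission(read_counts[0], h) for h in haplotypes}
    backpointers = []
    for col in read_counts[1:]:
        new_score, pointers = {}, {}
        for h in haplotypes:
            prev = max(haplotypes, key=lambda g: score[g] + step(g, h))
            new_score[h] = score[prev] + step(prev, h) + emission(col, h)
            pointers[h] = prev
        score = new_score
        backpointers.append(pointers)

    # Trace back from the best final node to recover the full path.
    path = [max(haplotypes, key=lambda h: score[h])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Five reference ranges; the second range has no reads at all.
counts = [{"B73": 3}, {}, {"B73": 1}, {"Mo17": 4}, {"Mo17": 2}]
print(impute_path(counts))
```

In this toy input the range with no reads is filled in from its neighbours because switching genomes is penalised, which is the general idea behind recovering complete haplotypes from very low coverage. The genome labels are placeholders and the numbers are made up for the example.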
