Journal
BIOINFORMATICS
Volume 38, Issue 15, Pages 3698-3702
Publisher
OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/btac410
Funding
- U.S. Department of Agriculture-Agricultural Research Service, National Science Foundation Research-PGR [IOS-1238014, IOS-1822330]
- Bill and Melinda Gates Foundation [OPP1159867, OPP1175661]
Pangenomes provide novel insights into population and quantitative genetics, genomics, and breeding. However, managing and using pangenomes for genomically diverse species is challenging. We developed a trellis graph representation anchored to the reference genome that accurately represents most pangenomes and can be used to impute complete genomes from low-density sequence or variant data.
Motivation: Pangenomes provide novel insights for population and quantitative genetics, genomics and breeding that are not available from studying a single reference genome. A species is better represented by a pangenome, or collection of genomes, than by a single reference. Unfortunately, managing and using pangenomes for genomically diverse species is computationally and practically challenging. We developed a trellis graph representation anchored to the reference genome that represents most pangenomes well and can be used to impute complete genomes from low-density sequence or variant data.

Results: The Practical Haplotype Graph (PHG) is a pangenome pipeline, database (PostgreSQL and SQLite), data model (Java, Kotlin or R) and Breeding API (BrAPI) web service. The PHG has already been used to accurately represent diversity in four major crops, including maize, one of the most genomically diverse species, with up to 1000-fold data compression. Using simulated data, we show that, even at 0.1x coverage, with appropriate reads and sequence alignment, imputation results in extremely accurate haplotype reconstruction. The PHG is a platform and environment for the understanding and application of genomic diversity.
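To make the idea of imputing a genome through a reference-anchored trellis more concrete, the following is a minimal sketch and not the PHG implementation. It assumes the genome is divided into ordered reference ranges (trellis columns), that each range holds one node per candidate haplotype, and that low-coverage reads have already been counted per haplotype in each range. The function name `impute_path`, the parameters `switch_prob` and `error_rate`, and the use of a Viterbi search are all illustrative assumptions.

```python
import math


def impute_path(read_counts, switch_prob=0.01, error_rate=0.01):
    """Find the most likely haplotype path through a toy trellis (Viterbi).

    read_counts[i][h] is the number of reads in reference range i that match
    haplotype h. All names and parameters here are hypothetical.
    """
    haplotypes = sorted(read_counts[0])
    n_hap = len(haplotypes)
    log_stay = math.log(1.0 - switch_prob)
    # Probability of switching is shared evenly among the other haplotypes.
    log_switch = math.log(switch_prob / (n_hap - 1)) if n_hap > 1 else float("-inf")

    def transition(prev, cur):
        return log_stay if prev == cur else log_switch

    def emission(counts, hap):
        # Reads matching hap are "correct"; all other reads count as errors.
        total = sum(counts.values())
        hits = counts[hap]
        return hits * math.log(1.0 - error_rate) + (total - hits) * math.log(error_rate)

    score = {h: emission(read_counts[0], h) for h in haplotypes}
    back_pointers = []
    for counts in read_counts[1:]:
        new_score, pointers = {}, {}
        for h in haplotypes:
            best_prev = max(haplotypes, key=lambda p: score[p] + transition(p, h))
            new_score[h] = score[best_prev] + transition(best_prev, h) + emission(counts, h)
            pointers[h] = best_prev
        score = new_score
        back_pointers.append(pointers)

    # Trace back from the best final node to recover the full path.
    path = [max(haplotypes, key=lambda h: score[h])]
    for pointers in reversed(back_pointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Five reference ranges, two candidate haplotypes, very sparse reads.
counts = [
    {"A": 2, "B": 0},
    {"A": 0, "B": 0},  # no reads: filled in from neighbouring ranges
    {"A": 1, "B": 0},
    {"A": 0, "B": 3},
    {"A": 0, "B": 0},  # no reads
]
print(impute_path(counts))  # ['A', 'A', 'A', 'B', 'B']
```

In this toy example, ranges with no reads are assigned the haplotype of their neighbours because switching haplotypes between adjacent ranges is penalised, which is the intuition behind reconstructing a full haplotype path from sparse sequence data.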