Targeted de novo phasing and long-range assembly by template mutagenesis

Journal

NUCLEIC ACIDS RESEARCH
Volume 50, Issue 18

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/nar/gkac592

Funding

  1. Simons Foundation [SFARI] [SFARI 497800]
  2. Simons Foundation, Life Sciences Founders Directed Giving-Research [519054]
  3. NHGRI through the NYGC [3UM1HG008901-04S2]

This study describes a method for haplotype-phased sequence assembly that uses short reads only. By imprinting each template strand with a unique mutation pattern and correcting the mutated bases against an unmutated library, it assembles and phases genomic regions of 10 kb and longer with very low error rates.
Short-read sequencers provide highly accurate reads at very low cost. Unfortunately, short reads are often inadequate for important applications such as assembly in complex regions or phasing across distant heterozygous sites. In this study, we describe novel bench protocols and algorithms to obtain haplotype-phased sequence assemblies with ultra-low error for regions 10 kb and longer using short reads only. We accomplish this by imprinting each template strand from a target region with a dense and unique mutation pattern. The mutation process randomly and independently converts approximately 50% of cytosines to uracils. Sequencing libraries are made from both mutated and unmutated templates. Using de Bruijn graphs and paired-end read information, we assemble each mutated template and use the unmutated library to correct the mutated bases. Templates are partitioned into two or more haplotypes, and the final haplotypes are assembled and corrected for residual template mutations and PCR errors. With sufficient template coverage, the final assemblies have per-base error rates below 10^-9. We demonstrate this method on a four-member nuclear family, correctly assembling and phasing three genomic intervals, including the highly polymorphic HLA-B gene.
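
The imprinting and correction idea in the abstract can be illustrated with a small simulation. What follows is a minimal sketch under stated assumptions, not the authors' pipeline: it models a single strand as a plain string, treats each converted cytosine as reading out as T (uracil is read as thymine after amplification), and assumes the unmutated sequence is already known and aligned position by position. The function names `mutate_template`, `mutation_pattern` and `correct_template` are hypothetical, and the de Bruijn graph assembly, haplotype partitioning and PCR-error handling described above are not modeled.

```python
import random
from typing import Optional, Tuple


def mutate_template(seq: str, rate: float = 0.5,
                    rng: Optional[random.Random] = None) -> str:
    """Convert each C independently with probability `rate`.

    Assumption: a converted cytosine (uracil) is read as T by the sequencer.
    """
    rng = rng or random.Random()
    return "".join(
        "T" if base == "C" and rng.random() < rate else base
        for base in seq
    )


def mutation_pattern(original: str, mutated: str) -> Tuple[int, ...]:
    """Return the positions where a C in `original` reads as T in `mutated`."""
    return tuple(
        i for i, (a, b) in enumerate(zip(original, mutated))
        if a == "C" and b == "T"
    )


def correct_template(mutated: str, unmutated: str) -> str:
    """Restore converted bases using an aligned unmutated sequence.

    Wherever the mutated copy shows T and the unmutated copy shows C,
    take C; every other position is kept from the mutated copy.
    """
    return "".join(
        u if (m == "T" and u == "C") else m
        for m, u in zip(mutated, unmutated)
    )


if __name__ == "__main__":
    # Hypothetical toy region; a real target would be 10 kb or longer.
    region = "ACGTCCGATCGCTACCGTTACGCCATCG" * 20
    copy_a = mutate_template(region, rng=random.Random(1))
    copy_b = mutate_template(region, rng=random.Random(2))
    print("converted sites in copy A:", len(mutation_pattern(region, copy_a)))
    print("converted sites in copy B:", len(mutation_pattern(region, copy_b)))
    print("copy A restored:", correct_template(copy_a, region) == region)
```

Because each cytosine is converted independently, two templates drawn from a region with n cytosines share the same pattern with probability 2^-n at a 50% conversion rate, which is why a dense pattern can serve as a per-template label for grouping short reads.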