Article

Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome

Journal

NATURE BIOTECHNOLOGY
Volume 37, Issue 10, Pages 1155-+

Publisher

NATURE PORTFOLIO
DOI: 10.1038/s41587-019-0217-9

Funding

  1. National Human Genome Research Institute, National Institutes of Health
  2. NIH [1R01HG010040, UM1 HG008898]
  3. NSFC [31571353, 31822029]
  4. National Science Foundation [DBI-1350041]
  5. National Institutes of Health [R01-HG006677]
  6. German Research Foundation (DFG) [391137747, 395192176]
  7. National Institute of Standards and Technology
  8. U.S. Food and Drug Administration
  9. National Human Genome Research Institute [ZIAHG200398] (funding source: NIH RePORTER)

Abstract

The DNA sequencing technologies in use today produce either highly accurate short reads or less-accurate long reads. We report the optimization of circular consensus sequencing (CCS) to improve the accuracy of single-molecule real-time (SMRT) sequencing (PacBio) and generate highly accurate (99.8%) long high-fidelity (HiFi) reads with an average length of 13.5 kilobases (kb). We applied our approach to sequence the well-characterized human HG002/NA24385 genome and obtained precision and recall rates of at least 99.91% for single-nucleotide variants (SNVs), 95.98% for insertions and deletions <50 bp (indels) and 95.99% for structural variants. Our CCS method matches or exceeds the ability of short-read sequencing to detect small variants and structural variants. We estimate that 2,434 discordances are correctable mistakes in the 'genome in a bottle' (GIAB) benchmark set. Nearly all (99.64%) variants can be phased into haplotypes, further improving variant detection. De novo genome assembly using CCS reads alone produced a contiguous and accurate genome with a contig N50 of >15 megabases (Mb) and concordance of 99.997%, substantially outperforming assembly with less-accurate long reads.
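Two of the summary statistics quoted above, contig N50 and percent accuracy or concordance, follow standard definitions that can be sketched in a few lines. The Python below is only an illustrative sketch: the function names `contig_n50` and `phred` are hypothetical, the contig lengths are made-up toy values, and nothing here represents the authors' actual pipeline. Under the usual Phred convention, 99.8% read accuracy corresponds to roughly Q27 and 99.997% concordance to roughly Q45.

```python
import math


def contig_n50(lengths):
    """Return the N50 of a set of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contigs given")


def phred(accuracy):
    """Convert a fractional accuracy (e.g. 0.998) to a Phred-scaled quality."""
    return -10 * math.log10(1 - accuracy)


if __name__ == "__main__":
    toy_contigs = [2, 3, 4, 5, 6]          # toy lengths, not data from the paper
    print(contig_n50(toy_contigs))          # N50 of the toy assembly
    print(round(phred(0.998), 1))           # read accuracy reported in the abstract
    print(round(phred(0.99997), 1))         # assembly concordance reported in the abstract
```

The Phred scale is convenient here because it turns the small gap between 99.8% and 99.997% into a visible difference of about 18 quality units.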
