Article

Multiplex de Bruijn graphs enable genome assembly from long, high-fidelity reads

Journal

NATURE BIOTECHNOLOGY
Volume 40, Issue 7, Pages 1075-+

Publisher

NATURE PORTFOLIO
DOI: 10.1038/s41587-022-01220-6

Funding

  1. National Science Foundation EAGER award [2032783]
  2. Saint Petersburg State University [PURE 73023672]
  3. Directorate for Biological Sciences
  4. Division of Molecular and Cellular Biosciences [2032783] (funding source: National Science Foundation)

This article introduces the La Jolla Assembler (LJA), a multiplex de Bruijn graph algorithm for high-accuracy genome assembly from long, high-fidelity (HiFi) reads. LJA uses a Bloom filter, sparse de Bruijn graphs, and disjointig generation to construct the de Bruijn graph for large genomes and large k-mer sizes, and it reduces the error rate in HiFi reads by three orders of magnitude. Compared to state-of-the-art assemblers, LJA produces five-fold fewer misassemblies and more contiguous assemblies.
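
As a rough illustration of the Bloom filter idea mentioned above, the following minimal Python sketch stores the k-mers of a read in a fixed-size bit array. It is not LJA's implementation: the class name `KmerBloomFilter`, the helper `kmers`, the SHA-256-based hashing, and the parameter values are all assumptions chosen for the example.

```python
import hashlib


class KmerBloomFilter:
    """Toy Bloom filter over k-mers (illustrative only, not LJA's code)."""

    def __init__(self, num_bits=1 << 16, num_hashes=3):
        # num_bits and num_hashes are arbitrary demo values, not tuned parameters.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, kmer):
        # Derive several bit positions by hashing the k-mer with different seeds.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{kmer}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, kmer):
        # Set every bit position associated with this k-mer.
        for pos in self._positions(kmer):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, kmer):
        # True may be a false positive; False is always correct.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(kmer)
        )


def kmers(read, k):
    """Yield every length-k substring of a read (hypothetical helper)."""
    return (read[i:i + k] for i in range(len(read) - k + 1))


# Insert all 5-mers of a toy read into the filter.
bloom = KmerBloomFilter()
for kmer in kmers("ACGTACGTTAGC", 5):
    bloom.add(kmer)
```

A Bloom filter never reports an inserted k-mer as absent, but it may occasionally report an absent k-mer as present, which is the trade-off that buys a fixed memory footprint regardless of k.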
A multiplex de Bruijn graph algorithm allows high-accuracy genome assembly from long, high-fidelity reads. Although most existing genome assemblers are based on de Bruijn graphs, the construction of these graphs for large genomes and large k-mer sizes has remained elusive. This algorithmic challenge has become particularly pressing with the emergence of long, high-fidelity (HiFi) reads that have been recently used to generate a semi-manual telomere-to-telomere assembly of the human genome. To enable automated assemblies of long, HiFi reads, we present the La Jolla Assembler (LJA), a fast algorithm using the Bloom filter, sparse de Bruijn graphs and disjointig generation. LJA reduces the error rate in HiFi reads by three orders of magnitude, constructs the de Bruijn graph for large genomes and large k-mer sizes and transforms it into a multiplex de Bruijn graph with varying k-mer sizes. Compared to state-of-the-art assemblers, our algorithm not only achieves five-fold fewer misassemblies but also generates more contiguous assemblies. We demonstrate the utility of LJA via the automated assembly of a human genome that completely assembled six chromosomes.
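
To see why large k-mer sizes matter, the toy Python sketch below builds a de Bruijn graph from two short reads and lists the nodes with more than one outgoing edge. It is a simplified illustration under stated assumptions, not the LJA algorithm: the function names `de_bruijn_edges` and `branching_nodes` and the example reads are hypothetical, and error correction, sparse construction, and the multiplex transformation are all omitted.

```python
from collections import defaultdict


def de_bruijn_edges(reads, k):
    """Map each (k-1)-mer node to the set of (k-1)-mer nodes it points to."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Each k-mer is an edge from its prefix to its suffix.
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def branching_nodes(graph):
    """Nodes with more than one outgoing edge, i.e. ambiguous continuations."""
    return sorted(node for node, succ in graph.items() if len(succ) > 1)


# Two toy reads that share the repeat "ACG" in different contexts.
reads = ["TACGA", "CACGT"]

small_k = branching_nodes(de_bruijn_edges(reads, 3))  # repeat collapses
large_k = branching_nodes(de_bruijn_edges(reads, 5))  # repeat is spanned
print(small_k, large_k)
```

With k = 3 the shared repeat collapses into a single branching node, while with k = 5 each k-mer spans the repeat and the graph has no branches. In this toy setting, that is the motivation for building graphs with large, and eventually varying, k-mer sizes.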
