Article

Chasing perfection: validation and polishing strategies for telomere-to-telomere genome assemblies

Journal

NATURE METHODS
Volume 19, Issue 6, Pages 687-+

Publisher

NATURE PORTFOLIO
DOI: 10.1038/s41592-022-01440-3

Funding

  1. Intramural Research Program of the National Human Genome Research Institute (NHGRI), National Institutes of Health [NIH 1ZIAHG200398]
  2. National Science Foundation [DBI-1350041, IOS-1732253]
  3. NIH/NHGRI [R01HG006677, R01HG010485, U41HG010972, U01HG010961, U24HG011853, OT2OD026682, R01 1R01HG011274-01, R21 1R21HG010548-01, U01 1U01HG010971]
  4. HHMI
  5. Wellcome [WT206194]
  6. NIGMS [F32 GM134558]
  7. St. Petersburg State University [PURE73023672]
  8. Fulbright Fellowship
  9. National Institute of Standards and Technology

This work describes the validation and polishing strategies developed by the telomere-to-telomere consortium to evaluate and improve the first complete human genome assembly. The study identified small errors and structural misassemblies in the initial assembly and proposed a new repeat-aware polishing strategy to correct these errors, resulting in a significant improvement in assembly quality.
Advances in long-read sequencing technologies and genome assembly methods have enabled the recent completion of the first telomere-to-telomere human genome assembly, which resolves complex segmental duplications and large tandem repeats, including centromeric satellite arrays in a complete hydatidiform mole (CHM13). Although derived from highly accurate sequences, evaluation revealed evidence of small errors and structural misassemblies in the initial draft assembly. To correct these errors, we designed a new repeat-aware polishing strategy that made accurate assembly corrections in large repeats without overcorrection, ultimately fixing 51% of the existing errors and improving the assembly quality value from 70.2 to 73.9, as measured from PacBio high-fidelity and Illumina k-mers. By comparing our results to standard automated polishing tools, we outline common polishing errors and offer practical suggestions for genome projects with limited resources. We also show how sequencing biases in both high-fidelity and Oxford Nanopore Technologies reads cause signature assembly errors that can be corrected with a diverse panel of sequencing technologies.
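
To put the quality values quoted above in context, the following is a minimal sketch of how a Phred-scaled quality value (QV) relates to a per-base error rate. It assumes the standard Phred definition, QV = -10 log10(error rate). The `kmer_qv` helper is an additional assumption: it follows a commonly used k-mer-based estimate in which k-mers found only in the assembly, and not in the read set, are treated as errors. The abstract does not state the exact formula or tool used, and all function names and the k-mer counts in the usage note are hypothetical.

```python
import math


def qv_to_error_rate(qv: float) -> float:
    """Convert a Phred-scaled quality value to a per-base error rate."""
    return 10 ** (-qv / 10)


def error_rate_to_qv(error_rate: float) -> float:
    """Convert a per-base error rate (0 < rate <= 1) to a Phred-scaled QV."""
    if not 0 < error_rate <= 1:
        raise ValueError("error_rate must be in (0, 1]")
    return -10 * math.log10(error_rate)


def kmer_qv(asm_only_kmers: int, total_asm_kmers: int, k: int) -> float:
    """Estimate assembly QV from k-mer counts (assumed formula, see lead-in).

    asm_only_kmers: k-mers present in the assembly but absent from the reads.
    total_asm_kmers: all k-mers in the assembly.
    """
    if total_asm_kmers <= 0 or not 0 <= asm_only_kmers <= total_asm_kmers:
        raise ValueError("k-mer counts are inconsistent")
    if asm_only_kmers == 0:
        return math.inf  # no unsupported k-mers observed
    if asm_only_kmers == total_asm_kmers:
        return 0.0  # every k-mer is unsupported
    # Probability that a base is correct: (1 - asm_only / total) ** (1 / k).
    # log1p/expm1 keep precision when the error fraction is tiny.
    log_p_correct = math.log1p(-asm_only_kmers / total_asm_kmers) / k
    error_rate = -math.expm1(log_p_correct)
    return -10 * math.log10(error_rate)


if __name__ == "__main__":
    for qv in (70.2, 73.9):
        rate = qv_to_error_rate(qv)
        print(f"QV {qv}: error rate {rate:.3e}, about 1 error per {1 / rate:,.0f} bases")
```

Under the Phred definition, QV 70.2 corresponds to roughly one error per 10.5 million bases and QV 73.9 to roughly one per 24.5 million. As a purely illustrative call, `kmer_qv(1, 1_000_000, 21)` gives a QV of about 73.2. These per-base figures are not directly comparable to the 51% of errors reported as fixed, which counts the errors identified during evaluation.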
