Article

Benchmarking of next and third generation sequencing technologies and their associated algorithms for de novo genome assembly

Journal

MOLECULAR MEDICINE REPORTS
Volume 23, Issue 4, Pages -

Publisher

SPANDIDOS PUBL LTD
DOI: 10.3892/mmr.2021.11890

Keywords

de novo genome assembly; next generation sequencing; third generation sequencing; genomics; benchmarking; bioinformatics


Genome assemblers are computational tools for de novo genome assembly; the quality of their output is judged by contiguity and by the occurrence of misassemblies. Advances in sequencing technologies have led to novel assembly strategies that aim to address the weaknesses of each platform and produce complete genome maps. This study benchmarked several assembly strategies and found that HiFi sequencing has enabled new algorithms and may democratise genome assembly projects for smaller labs with limited resources.
Genome assemblers are computational tools for de novo genome assembly, based on a plenitude of primary sequencing data. The quality of genome assemblies is estimated by their contiguity and the occurrences of misassemblies (duplications, deletions, translocations or inversions). The rapid development of sequencing technologies has enabled the rise of novel de novo genome assembly strategies. The ultimate goal of such strategies is to utilise the features of each sequencing platform in order to address the existing weaknesses of each sequencing type and compose a complete and correct genome map. In the present study, the hybrid strategy, which is based on Illumina short paired-end reads and Nanopore long reads, was benchmarked using the MaSuRCA and Wengan assemblers. Moreover, the long-read assembly strategy was benchmarked using Canu for Nanopore reads, and using Hifiasm and HiCanu for PacBio HiFi reads. The assemblies were performed on a computational cluster with limited computational resources. Their outputs were evaluated in terms of accuracy and computational performance. The PacBio HiFi assembly strategy outperforms the others, while Hi-C scaffolding, which is based on chromatin 3D structure, is required in order to increase continuity, accuracy and completeness when large and complex genomes, such as the human one, are assembled. The use of Hi-C data is also necessary when using the hybrid assembly strategy. The results revealed that HiFi sequencing enabled the rise of novel algorithms that require less genome coverage than the other strategies, making assembly a less computationally demanding task. Taken together, these developments may lead to the democratisation of genome assembly projects, which are now approachable by smaller labs with limited technical and financial resources.
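
The abstract says assembly quality is estimated by contiguity but does not name the metric used. A common summary of contiguity is N50: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length. The sketch below is illustrative only, assuming N50 as the metric. The function name `n50` and the input lengths are hypothetical and are not taken from the study.

```python
from typing import Iterable


def n50(contig_lengths: Iterable[int]) -> int:
    """Return the N50 of a set of contig lengths (illustrative helper)."""
    # Sort contigs from longest to shortest.
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")

    half_total = sum(lengths) / 2
    running = 0
    # Accumulate lengths until at least half of the assembly is covered;
    # the contig that crosses the threshold defines N50.
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise AssertionError("unreachable for non-empty input")


# Made-up contig lengths, not data from the benchmark.
example_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(example_lengths))
```

A higher N50 for the same genome indicates a more contiguous assembly, but it says nothing about misassemblies, which is why the abstract treats the two criteria separately.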
