Article

New algorithms for accurate and efficient de novo genome assembly from long DNA sequencing reads

Journal

LIFE SCIENCE ALLIANCE
Volume 6, Issue 5, Pages -

Publisher

LIFE SCIENCE ALLIANCE LLC
DOI: 10.26508/lsa.202201719

Building de novo genome assemblies for complex genomes is possible with long-read DNA sequencing technologies. New algorithms are developed to assemble long DNA sequencing reads from haploid and diploid organisms. The algorithms demonstrate competitive accuracy and computational efficiency compared with other software currently used. This development is expected to be useful for researchers building genome assemblies for different species.
Building de novo genome assemblies for complex genomes is possible thanks to long-read DNA sequencing technologies. However, maximizing the quality of assemblies based on long reads is a challenging task that requires the development of specialized data analysis techniques. We present new algorithms for assembling long DNA sequencing reads from haploid and diploid organisms. The assembly algorithm builds an undirected graph with two vertices for each read based on minimizers selected by a hash function derived from the k-mer distribution. Statistics collected during the graph construction are used as features to build layout paths by selecting edges, ranked by a likelihood function. For diploid samples, we integrated a reimplementation of the ReFHap algorithm to perform molecular phasing. We ran the implemented algorithms on PacBio HiFi and Nanopore sequencing data taken from haploid and diploid samples of different species. Our algorithms showed competitive accuracy and computational efficiency, compared with other currently used software. We expect that this new development will be useful for researchers building genome assemblies for different species.
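To make the idea of minimizers chosen from the k-mer distribution more concrete, the following is a minimal Python sketch, not the authors' implementation. The abstract does not specify the hash function, so this example assumes a simple stand-in: each k-mer is ranked by its raw count across the reads (rarer k-mers preferred, ties broken lexicographically and then by position). The function names `kmer_counts` and `minimizers` and the parameters `k` and `w` are hypothetical.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every k-mer occurring in a collection of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def minimizers(seq, k, w, counts):
    """Return sorted (position, k-mer) pairs, one minimizer per window.

    A window is w consecutive k-mers. Assumption: the ranking uses the
    raw k-mer count (lower is preferred), then the k-mer string, then
    the position. The published method may use a different hash.
    """
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    selected = set()
    for start in range(len(kmers) - w + 1):
        best = min(
            range(start, start + w),
            key=lambda i: (counts.get(kmers[i], 0), kmers[i], i),
        )
        selected.add((best, kmers[best]))
    return sorted(selected)


if __name__ == "__main__":
    reads = ["ACGTTGCA", "GTTGCATC"]
    counts = kmer_counts(reads, 3)
    for read in reads:
        print(read, minimizers(read, 3, 3, counts))
```

In an overlap-based assembler, reads sharing minimizers become candidates for an edge in the assembly graph. Ranking by frequency is one plausible way to avoid seeding overlaps from highly repeated k-mers, but that motivation is an assumption here rather than a claim taken from the abstract.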
