4.8 Article

Progressive Cactus is a multiple-genome aligner for the thousand-genome era

期刊

NATURE
卷 587, 期 7833, 页码 246-+

出版社

NATURE PORTFOLIO
DOI: 10.1038/s41586-020-2871-y

关键词

-

向作者/读者索取更多资源

New genome assemblies have been arriving at a rapidly increasing pace, thanks to decreases in sequencing costs and improvements in third-generation sequencing technologies(1-3). For example, the number of vertebrate genome assemblies currently in the NCBI (National Center for Biotechnology Information) database(4) increased by more than 50% to 1,485 assemblies in the year from July 2018 to July 2019. In addition to this influx of assemblies from different species, new human de novo assemblies(5) are being produced, which enable the analysis of not only small polymorphisms, but also complex, large-scale structural differences between human individuals and haplotypes. This coming era and its unprecedented amount of data offer the opportunity to uncover many insights into genome evolution but also present challenges in how to adapt current analysis methods to meet the increased scale. Cactus(6), a reference-free multiple genome alignment program, has been shown to be highly accurate, but the existing implementation scales poorly with increasing numbers of genomes, and struggles in regions of highly duplicated sequences. Here we describe progressive extensions to Cactus to create Progressive Cactus, which enables the reference-free alignment of tens to thousands of large vertebrate genomes while maintaining high alignment quality. We describe results from an alignment of more than 600 amniote genomes, which is to our knowledge the largest multiple vertebrate genome alignment created so far. The Progressive Cactus program can create reference-free alignments of hundreds of large vertebrate genomes efficiently, and is used for the alignment of more than 600 amniote genomes.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

4.8
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据