Article, Data Paper

Highly accurate long-read HiFi sequencing data for five complex genomes

Journal

SCIENTIFIC DATA
Volume 7, Issue 1

Publisher

NATURE RESEARCH
DOI: 10.1038/s41597-020-00743-4

Funding

  1. United States Department of Agriculture National Institute of Food and Agriculture (NIFA) Specialty Crops Research Initiative [2017-51181-26833]
  2. California Strawberry Commission
  3. University of California
  4. USDA-ARS [8062-21000-041]
  5. NSF [IOS-1744001]

Abstract

The PacBio(R) HiFi sequencing method yields highly accurate long-read sequencing datasets with read lengths averaging 10-25 kb and accuracies greater than 99.5%. These accurate long reads can be used to improve results for complex applications such as single nucleotide and structural variant detection, genome assembly, assembly of difficult polyploid or highly repetitive genomes, and assembly of metagenomes. Currently, there is a need for sample datasets both to evaluate the benefits of these long, accurate reads and to develop bioinformatic tools, including genome assemblers, variant callers, and haplotyping algorithms. We present deep-coverage HiFi datasets for five complex samples, including the two inbred model genomes Mus musculus and Zea mays, as well as two complex genomes, octoploid Fragaria x ananassa and the diploid anuran Rana muscosa. Additionally, we release sequence data from a mock metagenome community. The datasets reported here can be used without restriction to develop new algorithms and explore complex genome structure and evolution. Data were generated on the PacBio Sequel II System.
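Read accuracy is often quoted on the Phred scale, Q = -10 * log10(1 - accuracy), so the 99.5% figure in the abstract corresponds to roughly Q23. The short Python sketch below illustrates that conversion only. The function name `accuracy_to_phred` is a hypothetical helper introduced here for illustration and is not taken from the paper or from any PacBio tool.

```python
import math


def accuracy_to_phred(accuracy: float) -> float:
    """Convert a per-base accuracy in [0, 1) to a Phred-scaled quality value.

    Hypothetical helper for illustration; not part of the paper or PacBio software.
    """
    if not 0.0 <= accuracy < 1.0:
        # An accuracy of exactly 1.0 would imply an infinite Phred score.
        raise ValueError("accuracy must be in the range [0, 1)")
    error_rate = 1.0 - accuracy
    return -10.0 * math.log10(error_rate)


if __name__ == "__main__":
    # Print the Phred equivalent of a few example accuracies.
    for acc in (0.99, 0.995, 0.999):
        print(f"accuracy {acc} ~ Q{accuracy_to_phred(acc):.1f}")
```

Working on the Phred scale makes it easier to compare the accuracy threshold against the per-base quality values stored in sequencing read files.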
