Article

Assessing species coverage and assembly quality of rapidly accumulating sequenced genomes

Journal

GIGASCIENCE
Volume 11, Issue -, Pages -

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/gigascience/giac006

Keywords

arthropod genomes; biodiversity genomics; BUSCO assessments; genome assembly; genome quality database; reproducible workflow

Funding

  1. Novartis Foundation [18B116]
  2. Swiss National Science Foundation [PP00P3_170664, PP00P3_202669]

This study presents an automated analysis workflow that surveys and assesses the completeness of genome assemblies from the phylum Arthropoda at the NCBI, and compiles the results into an interactively browsable resource. By using this resource, taxonomic coverage and assembly quality can be examined, and results from different datasets can be compared.

Background: Ambitious initiatives to coordinate genome sequencing of Earth's biodiversity mean that the accumulation of genomic data is growing rapidly. In addition to cataloguing biodiversity, these data provide the basis for understanding biological function and evolution. Accurate and complete genome assemblies offer a comprehensive and reliable foundation upon which to advance our understanding of organismal biology at genetic, species, and ecosystem levels. However, ever-changing sequencing technologies and analysis methods mean that available data are often heterogeneous in quality. To guide forthcoming genome generation efforts and promote efficient prioritization of resources, it is thus essential to define and monitor taxonomic coverage and quality of the data.

Findings: Here we present an automated analysis workflow that surveys genome assemblies from the United States NCBI, assesses their completeness using the relevant BUSCO datasets, and collates the results into an interactively browsable resource. We apply our workflow to produce a community resource of available assemblies from the phylum Arthropoda, the Arthropoda Assembly Assessment Catalogue. Using this resource, we survey current taxonomic coverage and assembly quality at the NCBI, examine how key assembly metrics relate to gene content completeness, and compare results from using different BUSCO lineage datasets.

Conclusions: These results demonstrate how the workflow can be used to build a community resource that enables large-scale assessments to survey species coverage and data quality of available genome assemblies, and to guide prioritizations for ongoing and future sampling, sequencing, and genome generation initiatives.
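
The "assess, then collate" step described in the Findings can be illustrated with a minimal Python sketch. It is not the authors' workflow. It assumes only that each assembly already has a BUSCO one-line summary string of the usual form (C, S, D, F, M percentages and the marker count n). The accession names used with it, the helper names `parse_busco_summary` and `collate`, and the 90% completeness threshold are hypothetical choices made for illustration.

```python
import re
from dataclasses import dataclass

# Pattern for a BUSCO one-line summary such as
# "C:98.5%[S:97.0%,D:1.5%],F:0.5%,M:1.0%,n:1013".
_SUMMARY = re.compile(
    r"C:(?P<C>[\d.]+)%\[S:(?P<S>[\d.]+)%,D:(?P<D>[\d.]+)%\],"
    r"F:(?P<F>[\d.]+)%,M:(?P<M>[\d.]+)%,n:(?P<n>\d+)"
)


@dataclass
class BuscoScore:
    complete: float     # C: complete BUSCOs (single-copy + duplicated), in percent
    single: float       # S: complete and single-copy, in percent
    duplicated: float   # D: complete and duplicated, in percent
    fragmented: float   # F: fragmented, in percent
    missing: float      # M: missing, in percent
    n_markers: int      # n: number of BUSCO markers in the lineage dataset


def parse_busco_summary(line):
    """Extract the percentages and marker count from one summary line."""
    m = _SUMMARY.search(line)
    if m is None:
        raise ValueError("not a BUSCO summary line: %r" % line)
    return BuscoScore(
        complete=float(m["C"]),
        single=float(m["S"]),
        duplicated=float(m["D"]),
        fragmented=float(m["F"]),
        missing=float(m["M"]),
        n_markers=int(m["n"]),
    )


def collate(summaries, min_complete=90.0):
    """Return (accession, complete %, passes threshold) rows, best first.

    `summaries` maps an assembly accession to its summary line.
    `min_complete` is an illustrative cut-off, not a value from the paper.
    """
    rows = []
    for accession, line in summaries.items():
        score = parse_busco_summary(line)
        rows.append((accession, score.complete, score.complete >= min_complete))
    return sorted(rows, key=lambda row: row[1], reverse=True)
```

Calling `collate` on a dictionary of accession-to-summary pairs yields a table sorted by completeness, which is the kind of per-assembly record a browsable catalogue could be built from.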
