4.7 Review

Accurate and complete genomes from metagenomes

Journal

GENOME RESEARCH
Volume 30, Issue 3, Pages 315-333

Publisher

COLD SPRING HARBOR LAB PRESS, PUBLICATIONS DEPT
DOI: 10.1101/gr.258640.119

Keywords

-

Funding

  1. Genome Canada Large-Scale Applied Research Program
  2. Ontario Research Fund: Research Excellence grants
  3. Lawrence Berkeley National Laboratory's Watershed Function Scientific Focus Area - DOE [DE-AC02-05CH11231]
  4. Office of Science and Office of Biological and Environmental Research (Lawrence Berkeley National Laboratory)
  5. National Institutes of Health (NIH) [RAI092531A, R01-GM109454]
  6. Chan Zuckerberg Biohub
  7. UC Berkeley-based Innovative Genomics Institute

Genomes are an integral component of the biological information about an organism; thus, the more complete the genome, the more informative it is. Historically, bacterial and archaeal genomes were reconstructed from pure (monoclonal) cultures, and the first reported sequences were manually curated to completion. However, the bottleneck imposed by the requirement for isolates precluded genomic insights for the vast majority of microbial life. Shotgun sequencing of microbial communities, referred to initially as community genomics and subsequently as genome-resolved metagenomics, can circumvent this limitation by obtaining metagenome-assembled genomes (MAGs); but gaps, local assembly errors, chimeras, and contamination by fragments from other genomes limit the value of these genomes. Here, we discuss genome curation to improve and, in some cases, achieve complete (circularized, no gaps) MAGs (CMAGs). To date, few CMAGs have been generated, although notably some are from very complex systems such as soil and sediment. Through analysis of about 7000 published complete bacterial isolate genomes, we verify the value of cumulative GC skew in combination with other metrics to establish bacterial genome sequence accuracy. The analysis of cumulative GC skew identified potential misassemblies in some reference genomes of isolated bacteria and the repeat sequences that likely gave rise to them. We discuss methods that could be implemented in bioinformatic approaches for curation to ensure that metabolic and evolutionary analyses can be based on very high-quality genomes.
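The cumulative GC skew metric mentioned above can be illustrated with a minimal sketch. It assumes the common definition of GC skew as (G - C) / (G + C) computed over fixed windows and summed along the sequence; the function names, window size, and step size are illustrative choices, not the authors' implementation. For a correctly assembled circular bacterial genome, the cumulative curve is generally expected to show a single minimum near the replication origin and a single maximum near the terminus, so extra peaks or abrupt reversals can flag regions worth inspecting for misassembly.

```python
from typing import List, Tuple


def gc_skew(seq: str, window: int = 1000, step: int = 1000) -> List[float]:
    """Per-window GC skew, (G - C) / (G + C); 0.0 if a window has no G or C."""
    seq = seq.upper()
    skews = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        skews.append((g - c) / (g + c) if (g + c) else 0.0)
    return skews


def cumulative_gc_skew(seq: str, window: int = 1000, step: int = 1000) -> List[float]:
    """Running sum of the per-window GC skew along the sequence."""
    total = 0.0
    cumulative = []
    for value in gc_skew(seq, window, step):
        total += value
        cumulative.append(total)
    return cumulative


def skew_extrema(cumulative: List[float]) -> Tuple[int, int]:
    """Window indices of the minimum and maximum of the cumulative curve.

    Under the assumption stated above, these approximate the origin
    (minimum) and terminus (maximum) of replication.
    """
    indices = range(len(cumulative))
    i_min = min(indices, key=cumulative.__getitem__)
    i_max = max(indices, key=cumulative.__getitem__)
    return i_min, i_max


# Toy sequence (hypothetical, not real data): a G-rich half followed by a C-rich half.
toy = "G" * 50 + "C" * 50
curve = cumulative_gc_skew(toy, window=10, step=10)
origin_idx, terminus_idx = skew_extrema(curve)
```

In practice the curve would be plotted against genome position and read alongside other evidence, such as read-mapping support, rather than used as a standalone test.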
