Article

Mash Screen: high-throughput sequence containment estimation for genome discovery

Journal

GENOME BIOLOGY
Volume 20, Issue 1, Pages -

Publisher

BMC
DOI: 10.1186/s13059-019-1841-x

Keywords

MinHash; Metagenomics; Sequencing; SRA; Viral Discovery; Polyomavirus

Funding

  1. Intramural Research Programs of the National Human Genome Research Institute
  2. National Cancer Institute, National Institutes of Health

The MinHash algorithm has proven effective for rapidly estimating the resemblance of two genomes or metagenomes. However, this method cannot reliably estimate the containment of a genome within a metagenome. Here, we describe an online algorithm capable of measuring the containment of genomes and proteomes within either assembled or unassembled sequencing read sets. We describe several use cases, including contamination screening and retrospective analysis of metagenomes for novel genome discovery. Using this tool, we provide containment estimates for every NCBI RefSeq genome within every SRA metagenome and demonstrate the identification of a novel polyomavirus species from a public metagenome.
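
The idea of streaming containment estimation can be illustrated with a minimal Python sketch. It is not the Mash Screen implementation: the function names (`kmers`, `hash_kmer`, `bottom_sketch`, `screen`), the choice of BLAKE2b as the hash, and the tiny `k` and sketch size are all assumptions made for illustration, and reverse-complement (canonical) k-mers are ignored. The reference genome is reduced to a bottom-s sketch of its smallest k-mer hashes, the reads are scanned once, and the estimate is the fraction of sketch hashes observed in the reads.

```python
import hashlib


def kmers(seq, k):
    """Yield every substring of length k in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def hash_kmer(kmer):
    """Map a k-mer to a 64-bit integer with a stable hash (assumed: BLAKE2b)."""
    digest = hashlib.blake2b(kmer.encode("ascii"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def bottom_sketch(seq, k, size):
    """Return the `size` smallest distinct k-mer hashes of seq."""
    hashes = {hash_kmer(km) for km in kmers(seq, k)}
    return set(sorted(hashes)[:size])


def screen(reference, reads, k=5, size=16):
    """Estimate the containment of `reference` within a stream of reads.

    The reads are consumed one at a time, so the read set never has to be
    held in memory; only the reference sketch and the set of matched
    hashes are stored.
    """
    sketch = bottom_sketch(reference, k, size)
    if not sketch:
        return 0.0  # reference shorter than k: nothing to match
    seen = set()
    for read in reads:
        for km in kmers(read, k):
            h = hash_kmer(km)
            if h in sketch:
                seen.add(h)
    return len(seen) / len(sketch)


if __name__ == "__main__":
    reference = "ACGTTGCAAGGCTTAACCGGTATCG"
    # Two overlapping reads that together cover the whole reference,
    # mixed with an unrelated read.
    reads = [reference[:15], "TTTTTTTTTT", reference[10:]]
    print(screen(reference, reads))
```

Because only the reference is sketched, the estimate does not shrink as the read set grows, which is the property that distinguishes containment from the resemblance (Jaccard) estimate that MinHash normally gives.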