Journal
BMC BIOINFORMATICS
Volume 19, Issue -, Pages -
Publisher
BMC
DOI: 10.1186/s12859-018-2398-5
Keywords
Richness estimation; Viral metagenomics; Average genome length
Funding
- Australian Research Council [LP140100670, DP150103512]
- Biodiversity Research Center, Academia Sinica, Taiwan
- MIFRS scholarship of The University of Melbourne
- MIRS scholarship of The University of Melbourne
- Australian National University
- Australian Research Council [LP140100670] Funding Source: Australian Research Council
Background
Estimating the parameters that describe the ecology of viruses, particularly those that are novel, is possible using metagenomic approaches. However, the best-performing existing methods require databases to first estimate the average genome length of a viral community before they can estimate other parameters, such as viral richness. Although this approach has been widely used, it can skew results, since the majority of viruses are yet to be catalogued in databases.
Results
In this paper, we present ENVirT, a method for estimating the richness of novel viral mixtures, and for the first time we also show that it is possible to simultaneously estimate the average genome length without a priori information. This is shown to be a significant improvement over database-dependent methods, since we can now robustly analyze samples that may include novel viral types under-represented in current databases. We demonstrate that the viral richness estimates produced by ENVirT are several orders of magnitude more accurate than the estimates produced by the existing methods PHACCS and CatchAll when benchmarked against simulated data. We repeated the analysis of 20 metavirome samples using ENVirT, which produced results in close agreement with complementary in vitro analyses.
Conclusions
These insights were previously not captured by existing computational methods. As such, ENVirT is shown to be an essential tool for enhancing our understanding of novel viral populations.