Article

Assembly-free quantification of vagrant DNA inserts

Journal

MOLECULAR ECOLOGY RESOURCES
Volume 23, Issue 5, Pages 1002-1013

Publisher

WILEY
DOI: 10.1111/1755-0998.13764

Keywords

endosymbionts; genome skimming; nuclear pseudogenes; NUMTs; NUPTs; quantification

This article introduces two statistical methods to estimate the abundance of nuclear inserts even without a nuclear genome assembly. The first method only requires low-coverage sequencing data commonly generated for population studies. The second method additionally requires individuals carrying extranuclear DNA with diverged genotypes. The study demonstrates the utility of low-coverage high-throughput sequencing data for quantifying nuclear vagrant DNAs.
Inserts of DNA from extranuclear sources, such as organelles and microbes, are common in eukaryote nuclear genomes. However, sequence similarity between the nuclear and extranuclear DNA, and a history of multiple insertions, make the assembly of these regions challenging. Consequently, the number, sequence and location of these vagrant DNAs cannot be reliably inferred from the genome assemblies of most organisms. We introduce two statistical methods to estimate the abundance of nuclear inserts even in the absence of a nuclear genome assembly. The first (intercept method) only requires low-coverage (<1x) sequencing data, as commonly generated for population studies of organellar and ribosomal DNAs. The second method additionally requires that a subset of the individuals carry extranuclear DNA with diverged genotypes. We validated our intercept method using simulations and by re-estimating the frequency of human NUMTs (nuclear mitochondrial inserts). We then applied it to the grasshopper Podisma pedestris, exceptional for both its large genome size and reports of numerous NUMT inserts, estimating that NUMTs make up 0.056% of the nuclear genome, equivalent to >500 times the mitochondrial genome size. We also re-analysed a museomics data set of the parrot Psephotellus varius, obtaining an estimate of only 0.0043%, in line with reports from other species of bird. Our study demonstrates the utility of low-coverage high-throughput sequencing data for the quantification of nuclear vagrant DNAs. Beyond quantifying organellar inserts, these methods could also be used on endosymbiont-derived sequences. We provide an R implementation of our methods called vagrantDNA and code to simulate test data sets.
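The abstract expresses the Podisma pedestris estimate in two ways: as a fraction of the nuclear genome (0.056%) and as a multiple of the mitochondrial genome size (>500). The conversion between the two can be sketched as below. This is only an illustration of the arithmetic, not part of the vagrantDNA package; the function name `organelle_equivalents` is hypothetical, and the nuclear and mitochondrial genome sizes are round placeholder values that do not come from the abstract.

```python
def organelle_equivalents(insert_fraction, nuclear_genome_bp, organelle_genome_bp):
    """Express the total length of nuclear inserts as multiples of the organelle genome.

    insert_fraction     -- proportion of the nuclear genome made up of inserts (0..1)
    nuclear_genome_bp   -- nuclear genome size in base pairs
    organelle_genome_bp -- organelle (e.g. mitochondrial) genome size in base pairs
    """
    insert_bp = insert_fraction * nuclear_genome_bp  # total insert length in bp
    return insert_bp / organelle_genome_bp


# 0.056% is the estimate reported in the abstract.
fraction = 0.056 / 100

# ASSUMPTION: placeholder genome sizes chosen only for illustration;
# they are not taken from the abstract.
nuclear_bp = 16_000_000_000
mito_bp = 16_000

equivalents = organelle_equivalents(fraction, nuclear_bp, mito_bp)
print(f"{equivalents:.0f} mitochondrial genome equivalents")
```

With these placeholder sizes the result is about 560 mitochondrial genome equivalents, which is consistent with the ">500 times" statement. The true figure depends on the actual genome sizes used in the study.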
