4.7 Editorial Material

Assessing genome assembly quality prior to downstream analysis: N50 versus BUSCO

Journal

MOLECULAR ECOLOGY RESOURCES
Volume 21, Issue 5, Pages 1416-1421

Publisher

WILEY
DOI: 10.1111/1755-0998.13364

Keywords

assembly; bioinformatics; BUSCO; gene mining; genome; N50


The study found that while assemblies with high contig and scaffold N50 values tend to have high BUSCO scores, a high BUSCO score can also be achieved in assemblies with low N50 values. N50 is not a perfect proxy for all measures of genome accuracy, and assessing gene space in genome assemblies requires appropriate tools as well as reporting additional genome assessment metrics.
With the ever-increasing number of publicly available eukaryotic genome assemblies and user-friendly bioinformatics tools, there are increasing opportunities for researchers to use genomic resources in their research. While there are multiple dimensions to genome quality, it is often reduced to a single score that may not be correlated with other metrics, or appropriate for all applications of an assembly. To assess whether the commonly reported N50 value could reliably predict a separate dimension of genome quality, gene space completeness, we performed a meta-analysis of 611 published articles on eukaryotic genomes that used BUSCO scores, in addition to the typical N50 score. We found that although assemblies with relatively high contig and scaffold N50 values consistently had high BUSCO scores, a high BUSCO score could also be obtained from assemblies with a low N50. This reinforces that despite its ubiquity, N50 is not a perfect proxy for all measures of genome accuracy. Our data also suggest that variations in BUSCO scores among assemblies with poor N50 scores may be related to the number of introns in conserved eukaryotic genes. We stress the importance of screening and evaluating assembly quality based on the appropriate tools and urge increased reporting of genome assessment metrics beyond N50. We also discuss the potential limitations of BUSCO and suggest improvements for assessing gene space within genome assemblies.
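Because the abstract turns on what N50 does and does not measure, a minimal sketch of the standard N50 calculation may help. It is not taken from the article: the function name `n50` and the two toy length lists are hypothetical, chosen only for illustration. N50 is the length of the shortest sequence in the smallest set of longest contigs (or scaffolds) that together cover at least half of the total assembly length.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths.

    Illustrative helper (not from the article): sort lengths from longest
    to shortest and return the length at which the running total first
    reaches half of the summed assembly length.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("need at least one sequence length")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Compare doubled running sum to avoid floating-point division.
        if running * 2 >= total:
            return length
    raise AssertionError("unreachable for non-negative lengths")


# Hypothetical toy assemblies (lengths in arbitrary units).
fragmented = [2, 2, 2, 3, 3, 4, 8, 8]
contiguous = [80, 70, 50, 40, 30, 20]

print(n50(fragmented))  # 8
print(n50(contiguous))  # 70
```

The calculation uses sequence lengths only. It carries no information about which genes are present, which is consistent with the finding that a low-N50 assembly can still achieve a high BUSCO score.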
