4.7 Article

Benchmarking germline CNV calling tools from exome sequencing data

期刊

SCIENTIFIC REPORTS
卷 11, 期 1, 页码 -

出版社

NATURE PORTFOLIO
DOI: 10.1038/s41598-021-93878-2

关键词

-

资金

  1. Ministry of Science and Higher Education of Russian Federation [075-15-2019-1669]
  2. Russian Foundation for Basic Research [17-29-06063]

向作者/读者索取更多资源

The study aimed to comprehensively analyze tools capable of germline CNV calling using a single CNV standard and reference sample set. Results showed that while most tools could detect a range of CNV lengths, they exhibited low concordance with each other. After unified comparison, it was found that the tools were not equivalent, allowing for the selection of algorithms or ensembles of algorithms most suitable for specific goals.
Whole-exome sequencing is an attractive alternative to microarray analysis because of the low cost and potential ability to detect copy number variations (CNV) of various sizes (from 1-2 exons to several Mb). Previous comparison of the most popular CNV calling tools showed a high portion of false-positive calls. Moreover, due to a lack of a gold standard CNV set, the results are limited and incomparable. Here, we aimed to perform a comprehensive analysis of tools capable of germline CNV calling available at the moment using a single CNV standard and reference sample set. Compiling variants from previous studies with Bayesian estimation approach, we constructed an internal standard for NA12878 sample (pilot National Institute of Standards and Technology Reference Material) including 110,050 CNV or non-CNV exons. The standard was used to evaluate the performance of 16 germline CNV calling tools on the NA12878 sample and 10 correlated exomes as a reference set with respect to length distribution, concordance, and efficiency. Each algorithm had a certain range of detected lengths and showed low concordance with other tools. Most tools are focused on detection of a limited number of CNVs one to seven exons long with a false-positive rate below 50%. EXCAVATOR2, exomeCopy, and FishingCNV focused on detection of a wide range of variations but showed low precision. Upon unified comparison, the tools were not equivalent. The analysis performed allows choosing algorithms or ensembles of algorithms most suitable for a specific goal, e.g. population studies or medical genetics.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

4.7
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据