Journal
JOURNAL OF COMPUTATIONAL BIOLOGY
Volume 29, Issue 9, Pages 987-1000
Publisher
MARY ANN LIEBERT, INC
DOI: 10.1089/cmb.2021.0269
Keywords
germline variant; next-generation sequencing; read depth distribution; somatic variant; variant calling; variant filtering
RDscan is a new method for improving the accuracy of germline and somatic variant calling in NGS data. It calculates an RDscore from the read depth distribution and uses it to remove false-positive variant calls. In testing, RDscan improved accuracy for most algorithms: in 11 of 12 cases for germline variants and in 21 of 24 cases for somatic variants.
Several tools have been developed for calling variants from next-generation sequencing (NGS) data. Although they are generally accurate and reliable, most of them have room for improvement, especially when calling variants in datasets with low read depth. In addition, the somatic variants predicted by different somatic variant callers tend to have very low concordance rates. In this study, we developed a new method (RDscan) for improving germline and somatic variant calling in NGS data. RDscan removes misaligned reads, repositions reads, and calculates RDscore based on the read depth distribution. With RDscore, RDscan improves the precision of variant callers by removing false-positive variant calls. When we tested our new tool using the latest variant calling algorithms and data from the 1000 Genomes Project and Illumina's public datasets, accuracy was improved for most of the algorithms. After screening variants with RDscan, calling accuracies increased for germline variants in 11 of 12 cases and for somatic variants in 21 of 24 cases. RDscan is simple to use and can effectively remove false-positive variants while maintaining a low computational load. Therefore, RDscan, along with existing variant callers, should contribute to improvements in genome analysis.
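The abstract does not give the RDscore formula, so the following is only a minimal sketch of the general idea of score-based post-filtering: calls below a cutoff are dropped, and precision is compared before and after against a truth set. The `Call` type, the `score` field, the cutoff of 0.5, and the toy data are all hypothetical illustrations, not RDscan's actual interface or results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    chrom: str
    pos: int
    alt: str
    score: float  # hypothetical per-variant score standing in for RDscore


def precision_recall(calls, truth):
    """Return (precision, recall) of calls against a set of true variants."""
    called = {(c.chrom, c.pos, c.alt) for c in calls}
    true_positives = len(called & truth)
    precision = true_positives / len(called) if called else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


def filter_calls(calls, min_score):
    """Keep only calls whose score reaches the (assumed) cutoff."""
    return [c for c in calls if c.score >= min_score]


# Toy truth set and caller output, invented purely for illustration.
truth = {("chr1", 100, "A"), ("chr1", 200, "T"), ("chr2", 50, "G")}
calls = [
    Call("chr1", 100, "A", 0.9),
    Call("chr1", 200, "T", 0.8),
    Call("chr1", 300, "C", 0.2),  # false positive with a low score
    Call("chr2", 50, "G", 0.7),
    Call("chr2", 75, "T", 0.1),   # false positive with a low score
]

before = precision_recall(calls, truth)
after = precision_recall(filter_calls(calls, min_score=0.5), truth)
print("before filtering:", before)
print("after filtering:", after)
```

In this toy case the filter removes only false positives, so precision rises while recall is unchanged. On real data a cutoff can also discard true variants, which is why both metrics are reported.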