4.6 Article

Improving variant calling using population data and deep learning

Journal

BMC BIOINFORMATICS
Volume 24, Issue 1, Pages -

Publisher

BMC
DOI: 10.1186/s12859-023-05294-0

In this study, population-aware DeepVariant models were developed to improve both the precision and recall of variant calling in single samples. By adding a channel that encodes allele frequencies from the 1000 Genomes Project, the model reduced variant calling errors and reduced the number of rare homozygous and pathogenic ClinVar calls cohort-wide. The study also found that diverse reference panels gave greater accuracy than population-specific panels, even when the panel population matched the sample ancestry.
Large-scale population variant data is often used to filter and aid interpretation of variant calls in a single sample. These approaches do not incorporate population information directly into the process of variant calling, and are often limited to filtering, which trades recall for precision. In this study, we develop population-aware DeepVariant models with a new channel encoding allele frequencies from the 1000 Genomes Project. This model reduces variant calling errors, improving both precision and recall in single samples, and reduces rare homozygous and pathogenic ClinVar calls cohort-wide. We assess the use of population-specific or diverse reference panels, finding the greatest accuracy with diverse panels, suggesting that large, diverse panels are preferable to individual populations, even when the population matches sample ancestry. Finally, we show that this benefit generalizes to samples with different ancestry from the training data, even when the ancestry is also excluded from the reference panel.
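
The idea of an allele frequency channel can be illustrated with a minimal sketch. This is not the DeepVariant implementation: the function names, the `MAX_PIXEL` scaling, and the choice to assign 0 to alleles absent from the panel are all assumptions made for illustration. It only shows how per-allele population frequencies might be laid out as one extra channel over a read pileup, so that a classifier sees them alongside the read evidence.

```python
from typing import Dict, List

# Hypothetical intensity ceiling for this sketch; not taken from the paper.
MAX_PIXEL = 254


def frequency_to_pixel(allele_frequency: float) -> int:
    """Map an allele frequency in [0, 1] to an integer pixel intensity."""
    clamped = min(max(allele_frequency, 0.0), 1.0)
    return round(clamped * MAX_PIXEL)


def frequency_channel(
    read_alleles: List[List[str]],
    panel_frequencies: List[Dict[str, float]],
) -> List[List[int]]:
    """Build one pileup channel (rows are reads, columns are positions).

    Each cell holds the scaled panel frequency of the allele that the
    read carries at that position. Alleles missing from the panel get 0
    (an assumption of this sketch).
    """
    channel = []
    for read in read_alleles:
        row = [
            frequency_to_pixel(panel_frequencies[pos].get(allele, 0.0))
            for pos, allele in enumerate(read)
        ]
        channel.append(row)
    return channel


# Two reads over two positions; the panel lists alternate-allele frequencies.
reads = [["A", "T"], ["G", "T"]]
panel = [{"G": 0.5}, {"T": 1.0}]
channel = frequency_channel(reads, panel)
```

In this toy layout, a read carrying a common allele produces a bright cell and a read carrying an allele unseen in the panel produces a dark one, which is the kind of signal a population-aware model could use during calling rather than in a separate filtering step.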
