Article

VariantStore: an index for large-scale genomic variant search

Journal

GENOME BIOLOGY
Volume 22, Issue 1

Publisher

BMC
DOI: 10.1186/s13059-021-02442-8

Keywords

Variation graph; Graph genomes; Pangenomes

Funding

  1. Gordon and Betty Moore Foundation's Data-Driven Discovery Initiative [GBMF4554]
  2. US National Institutes of Health [R01GM122935]
  3. Advanced Scientific Computing Research (ASCR) program within the Office of Science of the DOE [DE-AC02-05CH11231, 17-SC-20-SC]
  4. National Nuclear Security Administration

VariantStore indexes genomic variants from multiple samples using a variation graph and supports variant queries in any sample-specific coordinate system. It scales to large cohorts: indexing variants from the TCGA project takes 4 h, and indexing the 1000 Genomes project takes 3 h. Queries for variants in a gene take between 0.002 and 3 seconds and use memory equal to only 10% of the size of the full representation.
Efficiently scaling genomic variant search indexes to thousands of samples is computationally challenging because avoiding reference bias requires supporting multiple coordinate systems. We present VariantStore, a system that indexes genomic variants from multiple samples using a variation graph and enables variant queries across any sample-specific coordinate system. We show the scalability of VariantStore by indexing genomic variants from the TCGA project in 4 h and the 1000 Genomes project in 3 h. Querying for variants in a gene takes between 0.002 and 3 seconds and uses memory only 10% of the size of the full representation.
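
To make the idea of sample-specific coordinate systems concrete, here is a minimal Python sketch. It is not VariantStore's data structure or API: it uses a flat per-sample variant list instead of a variation graph, and every name in it (Variant, apply_variants, ref_to_sample, variants_in_region) is hypothetical. It assumes 0-based positions and non-overlapping variants, and it only shows how insertions and deletions shift a sample's coordinates relative to the reference.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    pos: int   # 0-based start position in the reference
    ref: str   # reference allele at that position
    alt: str   # allele carried by the sample


def apply_variants(reference, variants):
    """Build a sample's sequence by applying non-overlapping variants."""
    pieces = []
    cursor = 0
    for v in sorted(variants, key=lambda v: v.pos):
        # Sanity check: the recorded reference allele must match the reference.
        assert reference[v.pos:v.pos + len(v.ref)] == v.ref
        pieces.append(reference[cursor:v.pos])
        pieces.append(v.alt)
        cursor = v.pos + len(v.ref)
    pieces.append(reference[cursor:])
    return "".join(pieces)


def ref_to_sample(pos, variants):
    """Map a reference position to the sample's own coordinate system.

    Assumes pos does not fall inside a replaced reference allele.
    """
    shift = 0
    for v in variants:
        # Only variants that end at or before pos move it left or right.
        if v.pos + len(v.ref) <= pos:
            shift += len(v.alt) - len(v.ref)
    return pos + shift


def variants_in_region(index, sample, start, end):
    """Return the sample's variants whose reference start lies in [start, end)."""
    return [v for v in index[sample] if start <= v.pos < end]


reference = "ACGTACGTAC"
index = {
    "sample_a": [Variant(2, "G", "GTT"), Variant(7, "T", "C")],  # insertion, SNP
    "sample_b": [Variant(4, "AC", "A")],                         # deletion
}

seq_a = apply_variants(reference, index["sample_a"])
seq_b = apply_variants(reference, index["sample_b"])

# Reference position 8 lands at a different offset in each sample.
pos_a = ref_to_sample(8, index["sample_a"])
pos_b = ref_to_sample(8, index["sample_b"])
```

In this toy, the same reference base sits at offset 10 in sample_a and offset 7 in sample_b, which is why a query expressed in one sample's coordinates has to be translated before it can be answered in another's.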
