Article

A survey of direct-to-consumer genotype data, and quality control tool (GenomePrep) for research

Publisher

ELSEVIER
DOI: 10.1016/j.csbj.2021.06.040

Keywords

Genotyping; Direct-to-consumer sequencing; Open genome; Personal genome; SNP arrays

Funding

  1. Medical Research Council, UK Research and Innovation (UKRI) [MC_UP_1201/14]

The rapid growth of human genetic data is driven by medical research and direct-to-consumer sequencing companies. A review of over 7000 genomes is provided, along with a toolkit for preparing consumer DNA datasets for research purposes.
Two major forces have driven the rapid growth of human genetic data: medical research supported by governments and academic institutes, and direct-to-consumer (DTC) sequencing companies. While the former benefits from meticulously designed sequencing standards and quality control procedures, the latter comes in various formats and from various sequencing methods, which change over time and with the particular needs of different companies. Thanks to members of the general public who shared their DNA data without constraint, here we provide a review of over 7000 genomes made public between 2011 and 2020 and produced by over six DTC sequencing companies. We provide an open-source toolkit, GenomePrep, that systematically parses, quality-checks and filters genome files and statistically problematic alleles, to prepare consumer DNA datasets for research. The GenomePrep output is available in two common DNA data file formats to enable further analysis with other tools. We also provide for download the combined output for all OpenSNP array genomes processed in this paper in a single data-freeze file. © 2021 MRC Laboratory of Molecular Biology. Published by Elsevier B.V. on behalf of Research Network of Computational and Structural Biotechnology.
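
To make the parse, quality-check and filter steps concrete, the following is a minimal Python sketch of that kind of pipeline. It is not GenomePrep's implementation. It assumes a 23andMe-style raw export (tab-separated rsid, chromosome, position, genotype, with `#` comment lines and `--` for no-calls); the function names, the sample records and the rsIDs are hypothetical and serve only as illustration.

```python
import io

# Assumption: only unambiguous base calls count as valid genotypes.
VALID_BASES = set("ACGT")
VALID_CHROMS = {str(n) for n in range(1, 23)} | {"X", "Y", "MT"}


def parse_dtc_genotypes(handle):
    """Yield (rsid, chrom, pos, genotype) from a tab-separated DTC-style file."""
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split("\t")
        if len(fields) != 4:
            continue  # skip malformed rows
        rsid, chrom, pos, genotype = fields
        if not pos.isdigit():
            continue  # skip uncommented header rows or bad positions
        yield rsid, chrom, int(pos), genotype.upper()


def passes_qc(record):
    """Keep records on a known chromosome with one or two A/C/G/T calls."""
    _rsid, chrom, _pos, genotype = record
    if chrom not in VALID_CHROMS:
        return False
    if not 1 <= len(genotype) <= 2:
        return False
    return set(genotype) <= VALID_BASES  # rejects no-calls ("--") and indel codes


def call_rate(records):
    """Fraction of parsed records that pass the QC filter."""
    records = list(records)
    if not records:
        return 0.0
    return sum(passes_qc(r) for r in records) / len(records)


# Hypothetical sample data, not taken from any real genome file.
SAMPLE = (
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs0000001\t1\t1000\tAA\n"
    "rs0000002\t1\t2000\t--\n"
    "rs0000003\tX\t3000\tG\n"
    "rs0000004\t7\t4000\tDI\n"
    "badline\n"
)

records = list(parse_dtc_genotypes(io.StringIO(SAMPLE)))
kept = [r for r in records if passes_qc(r)]
print(len(records), len(kept), call_rate(records))
```

In this sketch the no-call and the insertion/deletion code are dropped, and the per-file call rate gives a simple statistic for flagging low-quality files. A real pipeline would also need to handle each company's format variants, which the abstract notes change over time.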
