Article

Imputing Genotypes in Biallelic Populations from Low-Coverage Sequence Data

Journal

GENETICS
Volume 202, Issue 2, Pages 487-+

Publisher

GENETICS SOCIETY OF AMERICA
DOI: 10.1534/genetics.115.182071

Keywords

hidden Markov models; imputation; next-generation sequencing; population genetics; plant genomics

Funding

  1. National Science Foundation (NSF) [0965420, 1419501]
  2. National Institutes of Health (NIH)
  3. Bill and Melinda Gates Foundation
  4. NIH Biomedical Informatics Research Training grant
  5. NSF
  6. NIH [R01-GM59507, RR-19895, RR-029676-01]
  7. Direct For Biological Sciences
  8. Division Of Integrative Organismal Systems [0965420] Funding Source: National Science Foundation

Low-coverage next-generation sequencing methodologies are routinely employed to genotype large populations. Missing data in these populations manifest both as missing markers and markers with incomplete allele recovery. False homozygous calls at heterozygous sites resulting from incomplete allele recovery confound many existing imputation algorithms. These types of systematic errors can be minimized by incorporating depth-of-sequencing read coverage into the imputation algorithm. Accordingly, we developed Low-Coverage Biallelic Impute (LB-Impute) to resolve missing data issues. LB-Impute uses a hidden Markov model that incorporates marker read coverage to determine variable emission probabilities. Robust, highly accurate imputation results were reliably obtained with LB-Impute, even at extremely low (<1x) average per-marker coverage. This finding will have implications for the design of genotype imputation algorithms in the future. LB-Impute is publicly available on GitHub at https://github.com/dellaportalaboratory/LB-Impute.
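
To illustrate the general idea of coverage-dependent emission probabilities, the following is a minimal Python sketch, not LB-Impute's actual implementation. It assumes a simple binomial read-sampling model, a hypothetical per-read error rate `err`, and a hypothetical fixed probability `stay` of remaining in the same genotype state between adjacent markers; the function names `emission_probs` and `viterbi` are likewise illustrative. Under this model, a marker covered by a single read is only weak evidence against heterozygosity, and a marker with no reads is uninformative.

```python
from math import comb, log

# Three possible genotype states at a biallelic marker (illustrative names).
STATES = ("hom_ref", "het", "hom_alt")


def emission_probs(ref_reads, alt_reads, err=0.05):
    """P(observed read counts | true genotype) under a binomial model.

    `err` is an assumed per-read error rate, not a value from the paper.
    With zero reads every state gets probability 1 (marker is uninformative).
    """
    n = ref_reads + alt_reads
    # Chance that a single read shows the alt allele, per true genotype.
    p_alt = {"hom_ref": err, "het": 0.5, "hom_alt": 1.0 - err}
    return {
        s: comb(n, alt_reads) * p_alt[s] ** alt_reads * (1.0 - p_alt[s]) ** ref_reads
        for s in STATES
    }


def viterbi(read_counts, stay=0.95, err=0.05):
    """Most likely genotype path for a list of (ref_reads, alt_reads) markers.

    `stay` is an assumed probability of keeping the same state at the next
    marker; the remainder is split evenly between the other two states.
    A uniform prior over the first state is assumed (constant, so omitted).
    """
    switch = (1.0 - stay) / (len(STATES) - 1)

    def log_trans(a, b):
        return log(stay if a == b else switch)

    first = emission_probs(*read_counts[0], err=err)
    score = {s: log(first[s]) for s in STATES}
    back = []  # back-pointers, one dict per marker after the first
    for ref_reads, alt_reads in read_counts[1:]:
        emit = emission_probs(ref_reads, alt_reads, err=err)
        new_score, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: score[p] + log_trans(p, s))
            pointers[s] = prev
            new_score[s] = score[prev] + log_trans(prev, s) + log(emit[s])
        score = new_score
        back.append(pointers)

    # Trace back from the best final state.
    state = max(STATES, key=score.get)
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Heterozygous region: two markers recover only one allele, one has no reads.
counts = [(1, 1), (1, 0), (0, 0), (0, 1), (1, 1)]
print(viterbi(counts))
```

With these assumed parameters, the markers covered by a single read, which a naive caller would report as homozygous, are decoded as heterozygous along with their neighbors, and the marker with no reads is filled in from context.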
