
Evaluation of vicinity-based hidden Markov models for genotype imputation

BMC BIOINFORMATICS (2022)

Journal

BMC BIOINFORMATICS

Volume 23, Issue 1, Pages -

Publisher

BMC

DOI: 10.1186/s12859-022-04896-4

Keywords

Genotype imputation; Hidden Markov models; Forward-Backward algorithm; Viterbi algorithm

Categories

Biochemical Research Methods; Biotechnology & Applied Microbiology; Mathematical & Computational Biology

Funding

University of Texas Health Science Center, Houston
Settlement Research Fund of UNIST (Ulsan National Institute of Science and Technology) [1.200109.01]
Institute of Information and communications Technology Planning and Evaluation (IITP) - Korea government (MSIT) [2020-0-01336]
Christopher Sarofim Family Professorship
CPRIT Scholar in Cancer Research [RR180012]
UTHealth startup
National Institutes of Health (NIH) [R13HG009072, R01GM114612]
National Science Foundation (NSF) [2027790]
UT Stars award


Summary

The decreasing cost of DNA sequencing has expanded knowledge of genetic variation. In-silico genotype imputation using reference panels is a cost-effective and accurate alternative to sequencing for genotyping common and uncommon variants. This study assesses the accuracy of vicinity-based hidden Markov models (HMMs) for imputation and shows that they can accurately impute both common and uncommon variants.

Abstract

Background: The decreasing cost of DNA sequencing has led to a great increase in our knowledge about genetic variation. While population-scale projects bring important insight into genotype-phenotype relationships, the cost of performing whole-genome sequencing on large samples is still prohibitive. In-silico genotype imputation coupled with genotyping-by-arrays is a cost-effective and accurate alternative for genotyping common and uncommon variants. Imputation methods compare the genotypes of the typed variants with large population-specific reference panels and estimate the genotypes of untyped variants by making use of linkage disequilibrium patterns. The most accurate imputation methods are based on the Li-Stephens hidden Markov model (HMM), which treats the sequence of each chromosome as a mosaic of the haplotypes from the reference panel.

Results: Here we assess the accuracy of vicinity-based HMMs, where each untyped variant is imputed using the typed variants in a small window around itself (as small as 1 centimorgan). Locality-based imputation has recently been used by machine learning-based genotype imputation approaches. We assess how the parameters of the vicinity-based HMMs impact the imputation accuracy in a comprehensive set of benchmarks and show that vicinity-based HMMs can accurately impute common and uncommon variants.

Conclusions: Our results indicate that locality-based imputation models can be effectively used for genotype imputation. The parameter settings that we identified can be used in future methods, and vicinity-based HMMs can be used for restructuring and parallelizing new imputation methods. The source code for the vicinity-based HMM implementations is publicly available at https://github.com/harmancilab/LoHaMMer.
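As a rough illustration of the vicinity-based idea described in the Results, the following Python sketch imputes a single untyped variant with a haploid Li-Stephens-style forward-backward pass restricted to the typed variants inside a window around the target. It is not the LoHaMMer implementation: the function name `impute_target`, the parameters `half_window_cm`, `switch_rate`, and `err`, and the exponential switch model are all assumptions chosen for brevity.

```python
import numpy as np

def impute_target(ref_haps, cm_pos, typed_mask, query, target,
                  half_window_cm=0.5, switch_rate=1.0, err=1e-3):
    """Posterior probability of the alternate allele at one untyped site.

    Hypothetical sketch, not the authors' code.
    ref_haps:   (K, M) 0/1 array of reference haplotypes.
    cm_pos:     (M,) sorted genetic positions in centimorgans.
    typed_mask: (M,) booleans, True where the query haplotype is typed.
    query:      (M,) query alleles; entries at untyped sites are ignored.
    target:     column index of the variant to impute.
    """
    ref_haps = np.asarray(ref_haps)
    cm_pos = np.asarray(cm_pos, dtype=float)
    typed_mask = np.asarray(typed_mask, dtype=bool)
    query = np.asarray(query)
    n_haps = ref_haps.shape[0]

    # Keep only typed sites inside the window, plus the target itself.
    keep = (np.abs(cm_pos - cm_pos[target]) <= half_window_cm) & typed_mask
    keep[target] = True
    sites = np.flatnonzero(keep)
    t_star = int(np.searchsorted(sites, target))
    n_sites = len(sites)

    # Emissions: match vs. mismatch with an assumed error rate.
    # The target is unobserved, so its emission is 1 for every state.
    match = ref_haps[:, sites].T == query[sites][:, None]
    emit = np.where(match, 1.0 - err, err)
    emit[t_star] = 1.0

    # Assumed switch probability between consecutive kept sites.
    rho = 1.0 - np.exp(-switch_rate * np.diff(cm_pos[sites]))

    # Forward pass, normalized at every step (so each row sums to 1).
    alpha = np.empty((n_sites, n_haps))
    alpha[0] = emit[0] / emit[0].sum()
    for t in range(1, n_sites):
        a = emit[t] * ((1.0 - rho[t - 1]) * alpha[t - 1] + rho[t - 1] / n_haps)
        alpha[t] = a / a.sum()

    # Backward pass, also normalized at every step.
    beta = np.ones((n_sites, n_haps))
    for t in range(n_sites - 2, -1, -1):
        w = emit[t + 1] * beta[t + 1]
        b = (1.0 - rho[t]) * w + rho[t] * w.sum() / n_haps
        beta[t] = b / b.sum()

    # Posterior over reference haplotypes at the target site.
    post = alpha[t_star] * beta[t_star]
    post /= post.sum()
    return float(post @ ref_haps[:, target])
```

Because every target uses only its own window, calls for different targets are independent of each other, which is the property the Conclusions point to when they mention restructuring and parallelizing imputation methods.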
