4.7 Article

Mi3-GPU: MCMC-based inverse Ising inference on GPUs for protein covariation analysis

Journal

COMPUTER PHYSICS COMMUNICATIONS
Volume 260, Issue -, Pages -

Publisher

ELSEVIER
DOI: 10.1016/j.cpc.2020.107312

Keywords

Protein evolution; Covariation analysis; GPU computing; Ising model; Monte Carlo

Funding

  1. National Institutes of Health [U54-GM133068, R35-GM132090]
  2. National Science Foundation [193484, 1625061]
  3. US Army Research Laboratory [W911NF-16-2-0189]

Inverse Ising inference is a method used in protein physics to infer coupling parameters of a Potts/Ising model based on observed site-covariation. The Mi3-GPU software, utilizing GPU-accelerated Markov-Chain Monte Carlo sampling, solves the inverse Ising problem for protein-sequence datasets, generating models that accurately replicate observed MSA covariation patterns.
Inverse Ising inference is a method for inferring the coupling parameters of a Potts/Ising model based on observed site-covariation, which has found important applications in protein physics for detecting interactions between residues in protein families. We introduce Mi3-GPU ("mee-three", for MCMC Inverse Ising Inference) software for solving the inverse Ising problem for protein-sequence datasets with few analytic approximations, by parallel Markov-Chain Monte Carlo sampling on GPUs. We also provide tools for analysis and preparation of protein-family Multiple Sequence Alignments (MSAs) to account for finite-sampling issues, which are a major source of error or bias in inverse Ising inference. Our method is "generative" in the sense that the inferred model can be used to generate synthetic MSAs whose mutational statistics (marginals) can be verified to match the dataset MSA statistics up to the limits imposed by the effects of finite sampling. Our GPU implementation enables the construction of models which reproduce the covariation patterns of the observed MSA with a precision that is not possible with more approximate methods. The main components of our method are a GPU-optimized algorithm to greatly accelerate MCMC sampling, combined with a multi-step Quasi-Newton parameter update scheme using a "Zwanzig reweighting" technique. We demonstrate the ability of this software to produce generative models on typical protein family datasets for sequence lengths L ~ 300 with 21 residue types with tens of millions of inferred parameters in short running times.

Program summary

Program Title: Mi3-GPU
Program Files doi: http://dx.doi.org/10.17632/ftbcfy2p35.1
Licensing provisions: GPLv3
Programming languages: Python3, OpenCL, C
Nature of problem: Mi3-GPU solves the inverse Ising problem for application in protein covariation analysis. The goal is to infer "coupling" parameters between positions in a Multiple Sequence Alignment of a protein family, with many applications including protein-contact prediction and fitness prediction.
Solution method: Mi3-GPU solves the inverse Ising problem with few approximations using Markov-Chain Monte Carlo methods with Quasi-Newton optimization on GPUs. This problem previously has been approached by more approximate methods using analytic approximations including "message passing", Susceptibility Propagation, mean-field methods, pseudolikelihood approximations, and cluster expansion. The software leverages GPUs to accelerate MCMC sampling and a histogram reweighting technique to accelerate parameter optimization.

(C) 2020 Elsevier B.V. All rights reserved.
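The two ingredients named in the abstract, MCMC sampling from a Potts model and reweighting of an existing sample to estimate marginals under updated parameters, can be illustrated with a small CPU-only NumPy sketch. This is not the Mi3-GPU code or its API: the function names, the array layout (`h` of shape `(L, q)`, `J` of shape `(L, L, q, q)` with only `i < j` used), the sign convention `P(s) ∝ exp(-E(s))`, and the simple single-site Metropolis move are all assumptions made for illustration.

```python
import numpy as np

def potts_energy(seqs, h, J):
    """E(s) = sum_i h[i, s_i] + sum_{i<j} J[i, j, s_i, s_j] for each row of seqs.

    Assumed convention (not taken from the source): P(s) is proportional to exp(-E(s)).
    """
    seqs = np.atleast_2d(seqs)
    L = seqs.shape[1]
    e = h[np.arange(L), seqs].sum(axis=1).astype(float)
    for i in range(L - 1):
        for j in range(i + 1, L):
            e += J[i, j, seqs[:, i], seqs[:, j]]
    return e

def metropolis_sample(h, J, n_walkers, n_steps, rng):
    """Run n_walkers independent Metropolis chains for n_steps single-site moves."""
    L, q = h.shape
    seqs = rng.integers(q, size=(n_walkers, L))
    energy = potts_energy(seqs, h, J)
    for _ in range(n_steps):
        pos = rng.integers(L)                      # one position, shared by all walkers
        trial = seqs.copy()
        trial[:, pos] = rng.integers(q, size=n_walkers)
        trial_energy = potts_energy(trial, h, J)
        # Accept with probability min(1, exp(-(E_trial - E))).
        accept = rng.random(n_walkers) < np.exp(np.minimum(0.0, energy - trial_energy))
        seqs[accept] = trial[accept]
        energy[accept] = trial_energy[accept]
    return seqs

def reweighted_pair_marginals(seqs, h, J, h_new, J_new, i, j):
    """Estimate the (i, j) pair marginals under (h_new, J_new) from a sample
    drawn under (h, J), weighting each sequence by exp(-(E_new - E_old))."""
    q = h.shape[1]
    dE = potts_energy(seqs, h_new, J_new) - potts_energy(seqs, h, J)
    w = np.exp(-(dE - dE.min()))                   # shift for numerical stability
    w /= w.sum()
    f = np.zeros((q, q))
    np.add.at(f, (seqs[:, i], seqs[:, j]), w)      # weighted 2D histogram
    return f
```

In this sketch, reweighting lets several trial parameter updates be evaluated against one sample without re-running the sampler, which is the general motivation for a multi-step update scheme; the estimate degrades as the new parameters move far from the ones used for sampling, because a few sequences then dominate the weights.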
