Article

Capturing non-local interactions by long short-term memory bidirectional recurrent neural networks for improving prediction of protein secondary structure, backbone angles, contact numbers and solvent accessibility

Journal

BIOINFORMATICS
Volume 33, Issue 18, Pages 2842-2849

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/btx218

Funding

  1. National Health and Medical Research Council of Australia [1059775, 1083450]
  2. Australian Research Council [LE150100161]

Motivation: The accuracy of predicting protein local and global structural properties, such as secondary structure and solvent accessible surface area, has been stagnant for many years because of the challenge of accounting for non-local interactions between amino acid residues that are close in three-dimensional structural space but far from each other in sequence position. All existing machine-learning techniques relied on a sliding window of 10-20 amino acid residues to capture some 'short to intermediate' non-local interactions. Here, we employed Long Short-Term Memory (LSTM) Bidirectional Recurrent Neural Networks (BRNNs), which are capable of capturing long-range interactions without using a window.

Results: We showed that applying LSTM-BRNN to the prediction of protein structural properties yields the most significant improvement for residues with the most long-range contacts (|i-j| > 19) over SPIDER2, a previous window-based, deep-learning method. Capturing long-range interactions allows the accuracy of three-state secondary structure prediction to reach 84% and the correlation coefficient between predicted and actual solvent accessible surface areas to reach 0.80, along with reductions of 5%, 10%, 5% and 10% in the mean absolute error for the backbone φ, ψ, θ and τ angles, respectively, relative to SPIDER2. More significantly, 27% of 182724 40-residue models constructed directly from predicted Cα atom-based θ and τ angles have structures similar to their corresponding native structures (6 Å RMSD or less), which is 3% better than models built from φ and ψ angles. We expect the method to be useful for assisting protein structure and function prediction.

Availability and implementation: The method is available as a SPIDER3 server and standalone package at http://sparks-lab.org.

Contact: yaoqi.zhou@griffith.edu.au or yuedong.yang@griffith.edu.au

Supplementary information: Supplementary data are available at Bioinformatics online.
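
To illustrate what "long-range contacts" means for a residue pair separated by more than 19 sequence positions, the following is a minimal sketch that counts such contacts per residue from Cα coordinates. It is not taken from the SPIDER3 package: the function name `long_range_contacts`, the 8 Å distance cutoff and the toy hairpin coordinates are all assumptions made for illustration, and the abstract itself does not state which cutoff defines a contact.

```python
import math


def long_range_contacts(ca_coords, min_separation=20, cutoff=8.0):
    """Count long-range contacts per residue.

    ca_coords      -- list of (x, y, z) C-alpha coordinates in angstroms
    min_separation -- smallest |i - j| counted; 20 corresponds to |i - j| > 19
    cutoff         -- contact distance in angstroms (assumed value, not from
                      the abstract)
    """
    n = len(ca_coords)
    counts = [0] * n
    for i in range(n):
        # Only pairs at least min_separation apart in sequence are examined.
        for j in range(i + min_separation, n):
            if math.dist(ca_coords[i], ca_coords[j]) <= cutoff:
                counts[i] += 1
                counts[j] += 1
    return counts


# Hypothetical toy geometry: a flat 30-residue hairpin with 3.8 A spacing
# along x, whose two strands lie 5 A apart along y.
strand_a = [(3.8 * k, 0.0, 0.0) for k in range(15)]
strand_b = [(3.8 * (29 - k), 5.0, 0.0) for k in range(15, 30)]
counts = long_range_contacts(strand_a + strand_b)
```

In this toy hairpin only the residues near the two chain ends acquire long-range contacts, while residues at the turn have none, because their spatial neighbours are also close in sequence. Residues of the first kind are the ones a fixed sliding window of 10-20 residues cannot cover.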
