Article

Predicting drug-protein interaction using quasi-visual question answering system

Journal

NATURE MACHINE INTELLIGENCE
Volume 2, Issue 2, Pages 134-140

Publisher

NATURE PORTFOLIO
DOI: 10.1038/s42256-020-0152-y

Funding

  1. National Key R&D Program of China [2018YFC0910500]
  2. GD Frontier and Key Tech Innovation Program [2018B010109006, 2019B020228001]
  3. National Natural Science Foundation of China [61772566, U1611261, 81801132, 81903540]
  4. Program for Guangdong Introducing Innovative and Entrepreneurial Teams [2016ZT06D211]

Identifying novel drug-protein interactions is crucial for drug discovery. For this purpose, many machine learning-based methods have been developed based on drug descriptors and one-dimensional protein sequences. However, protein sequences cannot accurately reflect the interactions in three-dimensional space. Direct input of the three-dimensional structure, on the other hand, is inefficient because the three-dimensional matrix is sparse, and it is further hindered by the limited number of co-crystal structures available for training. Here we propose an end-to-end deep learning framework to predict the interactions by representing proteins with a two-dimensional distance map from monomer structures (Image) and drugs with molecular linear notation (String), following the visual question answering mode. For efficient training of the system, we introduce a dynamic attentive convolutional neural network to learn fixed-size representations from the variable-length distance maps and a self-attentional sequential model to automatically extract semantic features from the linear notations. Extensive experiments demonstrate that our model performs competitively with state-of-the-art baselines on the directory of useful decoys, enhanced (DUD-E), human and BindingDB benchmark datasets. Further attention visualization provides biological interpretation by highlighting the relevant regions of both protein and drug molecules.

When predicting how proteins interact with potential drugs, the protein can be encoded either as its one-dimensional sequence or as its three-dimensional structure; the latter captures more relevant features of the protein but also makes the prediction task harder. A new method predicts these interactions using a two-dimensional distance matrix representation of the protein, which can be processed like a two-dimensional image, striking a balance between data that are simple to process and data that are rich in relevant structure.
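To make the two-dimensional distance representation concrete, the following minimal sketch builds a pairwise distance matrix from per-residue coordinates. It is an illustration only: the function name `distance_map`, the use of one coordinate triple per residue, and the toy coordinates are assumptions, and the listing does not state which atoms, cutoffs, or normalization the authors actually use.

```python
import math
from typing import List, Sequence


def distance_map(coords: Sequence[Sequence[float]]) -> List[List[float]]:
    """Return the L x L matrix of Euclidean distances between residues.

    `coords` holds one (x, y, z) triple per residue (hypothetical input;
    for example, one representative atom per residue).
    """
    for point in coords:
        if len(point) != 3:
            raise ValueError("each residue needs exactly three coordinates")
    # Entry [i][j] is the distance between residue i and residue j.
    return [[math.dist(a, b) for b in coords] for a in coords]


# Toy three-residue "protein" (made-up coordinates, not real structural data).
toy_coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
dmap = distance_map(toy_coords)

for row in dmap:
    print(" ".join(f"{d:5.1f}" for d in row))
```

The resulting matrix is symmetric with a zero diagonal, and its side length equals the number of residues, which is why a model consuming such maps has to cope with variable-sized inputs.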
