Journal
CHEMICAL SCIENCE
Volume 12, Issue 42, Pages 14174-14181
Publisher
ROYAL SOC CHEMISTRY
DOI: 10.1039/d1sc01839f
Keywords
-
Funding
- Bayer AG Life Science Collaboration (DeepMinds)
- Bayer AG Life Science Collaboration (Explainable AI)
- Bayer AG's PhD scholarships
- European Commission under the Horizon2020 Framework Program for Research and Innovation [963845, 956832]
- Marie Curie Actions (MSCA) [956832]
The paper introduces a model that combines deep convolutional neural network learning and a pre-trained decoder to accurately translate molecular images into SMILES representation. Evaluation shows that the model can correctly translate up to 88% of molecular images.
The automatic recognition of the molecular content of a molecule's graphical depiction is an extremely challenging problem that remains largely unsolved despite decades of research. Recent advances in neural machine translation enable the auto-encoding of molecular structures in a continuous vector space of fixed size (latent representation) with low reconstruction errors. In this paper, we present a fast and accurate model combining deep convolutional neural network learning from molecule depictions and a pre-trained decoder that translates the latent representation into the SMILES representation of the molecules. This combination allows us to precisely infer a molecular structure from an image. Our rigorous evaluation shows that Img2Mol is able to correctly translate up to 88% of the molecular depictions into their SMILES representation. A pretrained version of Img2Mol is made publicly available on GitHub for non-commercial users.
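The 88% figure above is a translation accuracy: the share of depictions whose predicted SMILES is counted as correct. The abstract does not describe the evaluation protocol, so the following is only a minimal sketch under the assumption that a prediction counts as correct when its SMILES string exactly equals the reference string. The function name `translation_accuracy` and the toy data are hypothetical and are not taken from Img2Mol.

```python
def translation_accuracy(predicted, reference):
    """Return the fraction of predicted SMILES that exactly match the reference.

    Assumption: both lists hold SMILES strings in the same order and in the
    same canonical form, so plain string equality is a meaningful comparison.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("at least one molecule is required")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)


# Hypothetical toy data: four reference molecules and four model outputs.
reference = ["CCO", "c1ccccc1", "CC(=O)O", "CN"]
predicted = ["CCO", "c1ccccc1", "CC(O)=O", "CN"]

accuracy = translation_accuracy(predicted, reference)
print(f"{accuracy:.0%}")  # prints "75%"
```

The third pair shows why the canonical-form assumption matters: `CC(=O)O` and `CC(O)=O` both denote acetic acid, yet plain string comparison scores the prediction as wrong. In practice both sides would likely be canonicalized with a cheminformatics toolkit before comparison.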