Journal
BIOORGANIC & MEDICINAL CHEMISTRY
Volume 66, Issue -, Pages -
Publisher
PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.bmc.2022.116808
Keywords
Analogue series; Structure-activity relationships; Analogue design; Deep learning; Natural language processing; Chemical language models
In this study, a chemical language model based on deep learning is introduced for analogue design. The model predicts preferred R-groups for new analogues based on ordered R-group sequences, taking into account the potency gradient and detectable SAR trends, providing a new concept for analogue design.
In medicinal chemistry, hit-to-lead and lead optimization efforts produce analogue series (ASs), the analysis of which is of central relevance for the exploration and exploitation of structure-activity relationships (SARs) and generation of candidate compounds. The key question in any chemical optimization effort is which analogue(s) to generate next, for which computational support is typically provided through QSAR analysis and compound potency predictions. In this study, we introduce a new chemical language model for analogue design via deep learning. For this purpose, ASs comprising active compounds are ordered according to increasing potency and the chemical language model predicts preferred R-groups for new analogues on the basis of ordered R-group sequences. Hence, consistent with the principles of deep models for natural language processing, analogues with new R-groups are predicted based upon conditional probabilities taking preceding groups into account. This implicitly accounts for the potency gradient captured by an AS and detectable SAR trends, providing a new concept for analogue design. Herein, we report the AS-based chemical language model, its initial evaluation, and exemplary applications.
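The idea of predicting the next R-group from potency-ordered R-group sequences can be illustrated with a deliberately simplified sketch. It is not the deep learning model reported in the paper: a first-order (bigram) count model stands in for the chemical language model, so each prediction is conditioned only on the immediately preceding R-group rather than on the full sequence. The toy series, the R-group labels, the potency values, and the function names `ordered_r_groups`, `fit_bigram`, and `next_r_group_probs` are all hypothetical and chosen for illustration only.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: each analogue series is a list of
# (R-group, potency) pairs. Labels and pIC50-like values are invented.
series = [
    [("Cl", 6.1), ("H", 5.0), ("OMe", 7.2)],
    [("H", 4.8), ("Cl", 5.9), ("OMe", 6.5), ("CF3", 7.0)],
    [("F", 5.5), ("H", 5.1), ("Cl", 6.3), ("CF3", 7.4)],
]


def ordered_r_groups(analogues):
    """Return the R-groups of one series sorted by increasing potency."""
    return [r for r, _ in sorted(analogues, key=lambda pair: pair[1])]


def fit_bigram(sequences):
    """Count how often each R-group directly follows another one."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return counts


def next_r_group_probs(counts, prev):
    """Estimate P(next R-group | previous R-group) from the counts."""
    followers = counts.get(prev, Counter())
    total = sum(followers.values())
    if total == 0:
        return {}  # no evidence for this R-group in the training series
    return {r: c / total for r, c in followers.items()}


# Order every series by increasing potency, then fit the stand-in model.
sequences = [ordered_r_groups(s) for s in series]
model = fit_bigram(sequences)

# Rank candidate R-groups for the analogue that follows a chloro analogue.
probs = next_r_group_probs(model, "Cl")
for r_group, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(f"{r_group}: {p:.2f}")
```

Because the sequences run from low to high potency, the R-groups that most often follow a given group are those seen in more potent analogues of the toy series, which is the sense in which ordering implicitly encodes the potency gradient. A sequence model that conditions on all preceding R-groups would replace the bigram counts in a realistic setting.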