Article (Data Paper)

67 million natural product-like compound database generated via molecular language processing

Journal

SCIENTIFIC DATA
Volume 10, Issue 1

Publisher

NATURE PORTFOLIO
DOI: 10.1038/s41597-023-02207-x

Natural products are a rich source of bioactive compounds for various applications. A database containing 67,064,204 natural product-like molecules was generated using a recurrent neural network, greatly expanding the library size compared to known natural products. This study demonstrates the potential of using deep generative models for high-throughput in silico discovery of novel natural product chemical space.
Natural products are a rich resource of bioactive compounds for valuable applications across multiple fields such as food, agriculture, and medicine. For natural product discovery, high-throughput in silico screening offers a cost-effective alternative to traditional, resource-heavy, assay-guided exploration of structurally novel chemical space. In this data descriptor, we report a characterized database of 67,064,204 natural product-like molecules generated using a recurrent neural network trained on known natural products, demonstrating a 165-fold expansion in library size over the approximately 400,000 known natural products. This study highlights the potential of using deep generative models to explore novel natural product chemical space for high-throughput in silico discovery.
