Article

Data Base similarity (DBsimilarity) of natural products to aid compound identification on MS and NMR pipelines, similarity networking, and more

Journal

PHYTOCHEMICAL ANALYSIS
Volume -, Issue -, Pages -

Publisher

WILEY
DOI: 10.1002/pca.3277

Keywords

chemoinformatics; dereplication; metabolomics; natural products; structure similarity

We developed a tool called DBsimilarity that organizes structure databases into similarity networks, making it easier for natural product chemists to visualize information. The tool converts SDF files into CSV files, calculates compound similarities, and constructs similarity networks in CSV format. Using DBsimilarity, several potential antibiotic compounds were identified among Ginkgo biloba compounds, along with other compounds worthy of further investigation.
Introduction

We developed Data Base similarity (DBsimilarity), a user-friendly tool designed to organize structure databases into similarity networks, with the goal of facilitating the visualization of information primarily for natural product chemists who may not have coding experience.

Method

DBsimilarity, written in Jupyter Notebooks, converts Structure Data File (SDF) files into Comma-Separated Values (CSV) files, adds chemoinformatics data, constructs an MZMine custom database file and an NMRfilter candidate list of compounds for rapid dereplication of MS and 2D NMR data, calculates similarities between compounds, and constructs CSV files formatted into similarity networks for Cytoscape.

Results

The Lotus database was used as a source for Ginkgo biloba compounds, and DBsimilarity was used to create similarity networks including NPClassifier classification to indicate biosynthesis pathways. Subsequently, a database of validated antibiotics from natural products was combined with the G. biloba compounds to identify promising compounds. The presence of 11 compounds in both datasets points to possible antibiotic properties of G. biloba, and 122 compounds similar to these known antibiotics were highlighted. Next, DBsimilarity was used to filter the NPAtlas database (selecting only those with MIBiG reference) to identify potential antibacterial compounds using the ChEMBL database as a reference. It was possible to promptly identify five compounds found in both databases and 167 others worthy of further investigation.

Conclusion

Chemical and biological properties are determined by molecular structures. DBsimilarity enables the creation of interactive similarity networks using Cytoscape. It is also in line with a recent review that highlights poor biological plausibility and unrealistic chromatographic behaviors as significant sources of errors in compound identification.
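The similarity-network step described in the Method can be illustrated with a minimal sketch. This is not the DBsimilarity code: the abstract does not state which fingerprint, similarity metric, or threshold the tool uses, so the Tanimoto coefficient, the 0.5 cutoff, the toy fingerprints, and the `source`/`target`/`similarity` column names below are all assumptions chosen for illustration. In practice the fingerprint bits would be computed from the SDF structures by a chemoinformatics toolkit; here they are hard-coded so the example runs with the Python standard library alone.

```python
import csv
import io
from itertools import combinations


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) coefficient between two sets of on-bit indices."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)


def similarity_edges(fingerprints: dict, threshold: float):
    """Yield (source, target, similarity) for each pair at or above threshold."""
    for name_a, name_b in combinations(sorted(fingerprints), 2):
        score = tanimoto(fingerprints[name_a], fingerprints[name_b])
        if score >= threshold:
            yield name_a, name_b, round(score, 3)


def edges_to_csv(edges) -> str:
    """Serialize edges as a CSV edge table (column names are hypothetical)."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["source", "target", "similarity"])
    writer.writerows(edges)
    return buf.getvalue()


# Hypothetical toy fingerprints: each set holds the indices of the on bits.
toy_fingerprints = {
    "compound_A": frozenset({1, 2, 3, 4}),
    "compound_B": frozenset({2, 3, 4, 5}),
    "compound_C": frozenset({10, 11}),
}

# Assumed threshold of 0.5; only A-B (3 shared bits out of 5) passes.
edges = list(similarity_edges(toy_fingerprints, threshold=0.5))
network_csv = edges_to_csv(edges)
print(network_csv)
```

An edge table of this shape can be loaded into a network viewer as a source/target list, with the similarity column available as an edge attribute. Raising the threshold gives a sparser network with tighter clusters, while lowering it connects more distantly related structures.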
