In Silico Enzymatic Synthesis of a 400 000 Compound Biochemical Database for Nontargeted Metabolomics

Journal

JOURNAL OF CHEMICAL INFORMATION AND MODELING
Volume 53, Issue 9, Pages 2483-2492

Publisher

American Chemical Society
DOI: 10.1021/ci400368v

Funding

  1. NIH [1R01GM087714]
  2. Agriculture and Food Research Initiative Competitive Grant from USDA National Institute of Food and Agriculture [2011-67016-30331]
  3. NSF [IIS-0916948]
  4. Directorate for Computer & Information Science & Engineering; Division of Information & Intelligent Systems [0916948] Funding Source: National Science Foundation
  5. NIFA [579659, 2011-67016-30331] Funding Source: Federal RePORTER

Current methods of structure identification in mass-spectrometry-based nontargeted metabolomics rely on matching experimentally determined features of an unknown compound to those of candidate compounds contained in biochemical databases. A major limitation of this approach is the relatively small number of compounds currently included in these databases. If the correct structure is not present in a database, it cannot be identified, and if it cannot be identified, it cannot be included in a database. Thus, there is an urgent need to augment metabolomics databases with rationally designed biochemical structures using alternative means. Here we present the In Vivo/In Silico Metabolites Database (IIMDB), a database of in silico enzymatically synthesized metabolites, to partially address this problem. The database, which is available at http://metabolomics.pharm.uconn.edu/iimdb/, includes approximately 23 000 known compounds (mammalian metabolites, drugs, secondary plant metabolites, and glycerophospholipids) collected from existing biochemical databases plus more than 400 000 computationally generated human phase-I and phase-II metabolites of these known compounds. IIMDB features a user-friendly web interface and a programmer-friendly RESTful web service. Ninety-five percent of the computationally generated metabolites in IIMDB were not found in any existing database. However, 21 640 were identical to compounds already listed in PubChem, HMDB, KEGG, or HumanCyc. Furthermore, the vast majority of these in silico metabolites were scored as biological using BioSM, a software program that identifies biochemical structures in chemical structure space. These results suggest that in silico biochemical synthesis represents a viable approach for significantly augmenting biochemical databases for nontargeted metabolomics applications.
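
The database-matching step described in the abstract can be illustrated with a minimal Python sketch. It assumes the only experimentally determined feature is a neutral monoisotopic mass and that candidates are compared within a parts-per-million tolerance. The names `Candidate`, `match_candidates`, and `TOY_DATABASE`, as well as the 10 ppm default, are hypothetical choices made for illustration; this is not IIMDB's web service or the authors' workflow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A database entry (hypothetical structure, for illustration only)."""
    name: str
    monoisotopic_mass: float  # neutral monoisotopic mass in Da


def ppm_error(observed: float, theoretical: float) -> float:
    """Signed mass error of the observation in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def match_candidates(observed_mass, candidates, tolerance_ppm=10.0):
    """Return (candidate, ppm error) pairs within tolerance, best match first."""
    scored = [(c, ppm_error(observed_mass, c.monoisotopic_mass)) for c in candidates]
    hits = [(c, err) for c, err in scored if abs(err) <= tolerance_ppm]
    return sorted(hits, key=lambda hit: abs(hit[1]))


# Toy stand-in for a biochemical database (assumed contents, three entries).
TOY_DATABASE = [
    Candidate("glucose", 180.06339),
    Candidate("caffeine", 194.08038),
    Candidate("acetaminophen", 151.06333),
]

# An unknown whose true structure is in the database can be matched.
hits = match_candidates(194.0810, TOY_DATABASE)

# An unknown whose true structure is absent yields no candidates at all.
missing = match_candidates(210.0753, TOY_DATABASE)

print([c.name for c, _ in hits])
print(missing)
```

The second query shows the limitation the abstract points to: a compound that is absent from the database returns no candidates, however accurate the measurement. Adding computationally generated metabolites enlarges the candidate list that such a lookup searches.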
