How Well Can We Predict Mass Spectra from Structures? Benchmarking Competitive Fragmentation Modeling for Metabolite Identification on Untrained Tandem Mass Spectra

JOURNAL OF CHEMICAL INFORMATION AND MODELING (2022)

Journal

JOURNAL OF CHEMICAL INFORMATION AND MODELING

Publisher

AMER CHEMICAL SOC

DOI: 10.1021/acs.jcim.2c00936

Categories

Chemistry, Medicinal; Chemistry, Multidisciplinary; Computer Science, Information Systems; Computer Science, Interdisciplinary Applications

Funding

National Institutes of Health (NIH) [U2C ES030158]

Summary

CFM-ID is a machine learning tool for predicting MS/MS spectra of metabolites. Matching the experimental collision energy to CFM-ID's prediction energy produced the best results, even on HCD-Orbitrap instruments, and benzenoids were predicted better than heterocyclic compounds. CFM-ID 4.0 may be most useful as a supplementary tool within broader identification workflows.

Abstract

Competitive Fragmentation Modeling for Metabolite Identification (CFM-ID) is a machine learning tool to predict in silico tandem mass spectra (MS/MS) for known or suspected metabolites for which chemical reference standards are not available. As a machine learning tool, it relies on both an underlying statistical model and an explicit training set that encompasses experimental mass spectra for specific compounds. Such mass spectra depend on specific parameters such as collision energies, instrument types, and adducts, which are accumulated in libraries. Yet, ultimately, prediction tools that are meant to cover wide expanses of entities must be validated on cases that were not included in the initial training and testing sets. Hence, we here benchmarked the performance of CFM-ID 4.0 in correctly predicting MS/MS spectra for compounds that were not included in the CFM-ID training set and for different mass spectrometry conditions. We used 609,456 experimental tandem spectra from the NIST20 mass spectral library that were newly added relative to the previous NIST17 library version. We found that CFM-ID's highest energy prediction output would maximize the capacity for library generation. Matching the experimental collision energy with CFM-ID's prediction energy produced the best results, even for HCD-Orbitrap instruments. For benzenoids, better MS/MS predictions were achieved than for heterocyclic compounds. However, when exploring CFM-ID's performance on 8,305 compounds at 40 eV HCD-Orbitrap collision energy, 90% of the 20/80 split test compounds showed an MS/MS similarity score below 700. Instead of a stand-alone tool, CFM-ID 4.0 might be useful to boost candidate structures in the greater context of identification workflows.
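To make the benchmarking idea concrete, the following is a minimal sketch of how a predicted spectrum might be compared with an experimental one. It rests on several assumptions that are not stated in the abstract: the score is a plain cosine similarity on unit-width m/z bins scaled to 0-1000 (the study's actual scoring function is not given here), the predicted energy levels are taken to be 10, 20, and 40 eV, and the function names `bin_spectrum`, `similarity_score`, and `closest_energy` are hypothetical helpers written for illustration only.

```python
from collections import defaultdict
from math import sqrt


def bin_spectrum(peaks, bin_width=1.0):
    """Sum intensities of (m/z, intensity) peaks into fixed-width m/z bins."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[round(mz / bin_width)] += intensity
    return binned


def similarity_score(experimental, predicted, bin_width=1.0):
    """Cosine similarity of two binned spectra, scaled to 0-1000.

    Assumption: this is an illustrative score, not the one used in the study.
    """
    a = bin_spectrum(experimental, bin_width)
    b = bin_spectrum(predicted, bin_width)
    # Only bins present in both spectra contribute to the dot product.
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return 1000.0 * dot / norm if norm else 0.0


def closest_energy(experimental_ev, predicted_levels=(10, 20, 40)):
    """Pick the predicted energy level nearest to the experimental one.

    Assumption: the default levels (in eV) are illustrative.
    """
    return min(predicted_levels, key=lambda level: abs(level - experimental_ev))


# Hypothetical peak lists as (m/z, intensity) pairs.
experimental = [(91.05, 100.0), (119.05, 40.0)]
predicted = [(91.05, 80.0), (65.04, 20.0)]

score = similarity_score(experimental, predicted)
level = closest_energy(35)
print(f"similarity: {score:.0f}, matched energy level: {level} eV")
```

In a benchmark of this kind, each experimental spectrum would be scored against the prediction at the matched energy level, and the distribution of scores (for example, the fraction below 700) summarizes prediction quality.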
