Article

Collision Cross Section Prediction with Molecular Fingerprint Using Machine Learning

Journal

MOLECULES
Volume 27, Issue 19

Publisher

MDPI
DOI: 10.3390/molecules27196424

Keywords

collision cross section; ion mobility spectrometry; non-target screening; machine learning

Funding

  1. SETASAR PhD Project from Region Nouvelle Aquitaine
  2. E2S UPPA
  3. LPL

High-resolution mass spectrometry combined with ion mobility spectrometry has been widely used in non-target screening, where it improves the accuracy of small molecule identification. Two CCS prediction models were developed using a machine learning algorithm; they can be applied across different IMS platforms to eliminate false positives in small molecule identification.
High-resolution mass spectrometry (HRMS) is a promising technique in non-target screening (NTS) for monitoring contaminants of emerging concern in complex samples. Current chemical identification strategies in NTS experiments typically depend on spectral libraries, chemical databases, and in silico fragmentation tools. However, small molecule identification remains challenging due to the lack of orthogonal sources of information (e.g., unique fragments). Collision cross section (CCS) values measured by ion mobility spectrometry (IMS) offer an additional identification dimension that increases the confidence level. Thanks to advances in analytical instrumentation, the use of IMS coupled with HRMS in NTS has been increasingly reported in recent decades. Several CCS prediction tools have been developed. However, few CCS prediction methods have been based on a broad range of chemical classes and on cross-platform CCS measurements. We successfully developed two prediction models using a random forest machine learning algorithm. One approach was based on the chemicals' super classes; the other predicted CCS directly from molecular fingerprints. Over 13,324 CCS values from six different laboratories and PubChem, acquired with a variety of ion-mobility separation techniques, were used for training and testing the models. The test accuracy for all the prediction models was over 0.85, and the median relative residual was around 2.2%. The models can be applied to different IMS platforms to eliminate false positives in small molecule identification.
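
The median relative residual and the false-positive filtering mentioned above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the relative residual is the absolute difference between predicted and measured CCS divided by the measured CCS, expressed in percent. The function names, the example CCS values, and the 5% tolerance are all hypothetical and chosen only for illustration.

```python
from statistics import median


def relative_residual(measured, predicted):
    """Absolute relative residual in percent (assumed definition)."""
    return abs(predicted - measured) / measured * 100.0


def median_relative_residual(measured_values, predicted_values):
    """Median of the per-compound relative residuals, in percent."""
    return median(
        relative_residual(m, p)
        for m, p in zip(measured_values, predicted_values)
    )


def filter_candidates(measured_ccs, candidates, tolerance_pct):
    """Keep candidates whose predicted CCS is within tolerance_pct
    of the measured CCS; the rest are treated as likely false positives."""
    return [
        name
        for name, predicted in candidates.items()
        if relative_residual(measured_ccs, predicted) <= tolerance_pct
    ]


# Hypothetical measured and predicted CCS values (in square angstroms).
measured = [150.0, 200.0, 180.0]
predicted = [153.0, 196.0, 180.9]
print(median_relative_residual(measured, predicted))

# Hypothetical candidate structures for one feature with measured CCS 200.0.
candidates = {"candidate_A": 203.0, "candidate_B": 215.0}
print(filter_candidates(200.0, candidates, tolerance_pct=5.0))
```

In practice the tolerance would be chosen from the error distribution of the prediction model on held-out data rather than fixed in advance.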