Article

In silico prediction of fragrance retention grades for monomer flavors using QSPR models

Publisher

ELSEVIER
DOI: 10.1016/j.chemolab.2021.104424

Keywords

QSPR models; Molecular descriptors; Applicability domain; Machine learning; In silico prediction

This study developed machine learning models to predict the fragrance retention grades (FRGs) of monomer flavors and found that the SH (thiol), ArOR (ether), and ArCOOR (ester) functional groups have a significant impact on FRGs. Applicability domains (AD) were defined and external data were used to validate model reliability, giving the perfume industry a reliable basis for evaluating FRGs.
The fragrance retention grades (FRGs) of monomer flavors contribute significantly to the development of perfumery technology. In silico prediction of the FRGs of monomer flavors is needed to reduce costs, time, and manual testing. Quantitative structure-property relationship (QSPR) models were established using a database of monomer flavors comprising 1552 odorants and their corresponding FRGs. Structural and physicochemical information on the odorant molecules was obtained using molecular descriptor calculation software (Dragon 7.0). To address the high dimensionality of the descriptors, we employed five feature extraction methods: principal component analysis, lasso, recursive feature elimination, an autoencoder, and the Boruta algorithm. Three machine learning algorithms (random forest, support vector machine, and deep neural network) were then applied and compared to develop QSPR models for estimating the FRGs of monomer flavors. We developed a weighted scoring formula to calculate correlation scores and analyze the association between functional groups and FRGs. The results demonstrated that the SH (thiol), ArOR (ether), and ArCOOR (ester) functional groups have a significant impact on FRGs. In addition, we defined applicability domains (AD) to limit the scope of application of the test dataset and used external data to validate model reliability. Finally, in a comparative analysis using 80-dimensional molecular descriptors (MDs) extracted by recursive feature elimination, the random forest algorithm outperformed the other two algorithms, with an accuracy of 77.81%, precision of 77.83%, recall of 77.99%, and F1-score of 77.88%. The proposed in silico QSPR model is therefore likely to be reliable for evaluating the FRGs of monomer flavors and suitable for adoption in the perfume industry.
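The best-performing configuration in the abstract (recursive feature elimination down to 80 descriptors, followed by a random forest classifier scored on accuracy, precision, recall, and F1) can be illustrated with a minimal scikit-learn sketch. This is not the authors' code. The data are synthetic stand-ins for Dragon descriptors, and the number of FRG classes (4), the macro averaging of the metrics, the 80/20 split, and all hyperparameters are assumptions chosen only to make the example run quickly.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline


def build_pipeline(n_descriptors=80, seed=0):
    """RFE keeps `n_descriptors` features, then a random forest classifies."""
    selector = RFE(
        estimator=RandomForestClassifier(n_estimators=50, random_state=seed),
        n_features_to_select=n_descriptors,
        step=20,  # assumption: drop 20 features per round to keep the demo fast
    )
    classifier = RandomForestClassifier(n_estimators=100, random_state=seed)
    return Pipeline([("rfe", selector), ("rf", classifier)])


def evaluate(model, X_test, y_test):
    """Return accuracy plus macro-averaged precision, recall, and F1."""
    y_pred = model.predict(X_test)
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_test, y_pred, average="macro", zero_division=0
    )
    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Synthetic placeholder data: 400 "odorants", 200 "descriptors", 4 "grades".
# The real study used 1552 odorants with Dragon 7.0 descriptors.
X, y = make_classification(
    n_samples=400, n_features=200, n_informative=30, n_classes=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

model = build_pipeline().fit(X_train, y_train)
scores = evaluate(model, X_test, y_test)
print(scores)
```

Placing the selector inside the `Pipeline` means descriptor selection is fitted on the training split only, so the held-out scores are not inflated by information leaking from the test set. Scores on this synthetic data are unrelated to the figures reported in the abstract.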
