Article

In silico prediction of fragrance retention grades for monomer flavors using QSPR models

Journal

Publisher

ELSEVIER
DOI: 10.1016/j.chemolab.2021.104424

Keywords

QSPR models; Molecular descriptors; Applicability domain; Machine learning; In silico prediction

This study developed a machine learning model for predicting the fragrance retention grades (FRGs) of monomer flavors and found that the SH (thiols), ArOR (ethers), and ArCOOR (esters) functional groups have a significant impact on FRGs. Applicability domains (AD) were defined and external data were used to validate the model's reliability, supporting its use in the perfumery industry.
The fragrance retention grades (FRGs) of monomer flavors contribute significantly to the development of perfumery technology. In silico prediction of the FRGs of monomer flavors is required to reduce costs, time, and manual testing. Quantitative structure-property relationship (QSPR) models were established using a database of monomer flavors comprising 1552 odorants and their corresponding FRGs. Structural and physicochemical information on the odorant molecules was computed with molecular descriptor software (Dragon 7.0). To address the challenge of high dimensionality, we employed five feature extractors: principal component analysis, LASSO, recursive feature elimination, an autoencoder, and the Boruta algorithm. Three machine learning algorithms (random forest, support vector machine, and deep neural network) were then applied and compared to develop QSPR models for estimating the FRGs of monomer flavors. We developed a weighted scoring formula to compute correlation scores and analyze associations between functional groups and FRGs. The results demonstrated that the SH (thiols), ArOR (ethers), and ArCOOR (esters) functional groups have a significant impact on the FRGs. In addition, we defined the applicability domains (AD) to delimit the scope of application on the test dataset and used external data to validate the model's reliability. Finally, we performed a comparative analysis using the 80 molecular descriptors (MDs) selected by recursive feature elimination. The random forest algorithm performed best, with an accuracy of 77.81%, precision of 77.83%, recall of 77.99%, and F1-score of 77.88%. The proposed in silico QSPR model is expected to be reliable for evaluating the FRGs of monomer flavors and could be adopted by the perfumery industry.
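The abstract reports accuracy, precision, recall, and F1-score for a multi-class grade prediction task. As a rough illustration of how such figures can be computed, the sketch below uses macro averaging over classes. This is an assumption: the abstract does not state which averaging scheme the authors used, and the function name `macro_metrics` and the toy grade labels are hypothetical, not taken from the paper.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Macro averaging is an assumption made for this sketch; the paper's
    averaging scheme is not given in the abstract.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    labels = sorted(set(y_true) | set(y_pred))
    true_count = Counter(y_true)   # samples per actual grade
    pred_count = Counter(y_pred)   # samples per predicted grade
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)  # correct per grade

    precisions, recalls, f1s = [], [], []
    for label in labels:
        # A class that is never predicted (or never present) scores 0.0.
        prec = hits[label] / pred_count[label] if pred_count[label] else 0.0
        rec = hits[label] / true_count[label] if true_count[label] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if (prec + rec) else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(hits.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


if __name__ == "__main__":
    # Hypothetical grades for six odorants (not data from the paper).
    actual = [1, 1, 2, 2, 3, 3]
    predicted = [1, 2, 2, 2, 3, 1]
    for name, value in macro_metrics(actual, predicted).items():
        print(f"{name}: {value:.4f}")
```

With macro averaging every grade counts equally regardless of how many odorants fall into it, which is one reason precision and recall can differ slightly from accuracy, as they do in the reported results.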