Article

Machine learning hybrid approach for the prediction of surface tension profiles of hydrocarbon surfactants in aqueous solution

Journal

JOURNAL OF COLLOID AND INTERFACE SCIENCE
Volume 625, Pages 328-339

Publisher

ACADEMIC PRESS INC ELSEVIER SCIENCE
DOI: 10.1016/j.jcis.2022.06.034

Keywords

Machine learning; Surfactant; Surface tension; QSPR; Critical micelle concentration

Funding

  1. EPSRC
  2. PG [18000140]
  3. Royal Academy of Engineering (UK)

This study uses machine learning to relate molecular descriptors to the surface tension profiles of hydrocarbon surfactants. Experimental data is fitted to extract characteristic parameters, and a gradient-boosted regressor model is used for feature selection. The results show a good correlation between the machine learning models and experimental data.
Hypothesis: Predicting the surface tension (SFT)-log(c) profiles of hydrocarbon surfactants in aqueous solution is computationally non-trivial, and empirically challenging due to the diverse and complex architecture and interactions of surfactant molecules. Machine learning (ML), combining a data-based and knowledge-based approach, can provide a powerful means to relate molecular descriptors to SFT profiles.

Experiments: A dataset of SFT for 154 model hydrocarbon surfactants at 20-30 °C is fitted to the Szyszkowski equation to extract three characteristic parameters (Γ_max, K_L and critical micelle concentration (CMC)), which are correlated to a series of 2D and 3D molecular descriptors. Key (~10) descriptors were selected by removing co-correlation, and employing a gradient-boosted regressor model to rank feature importance and carry out recursive feature elimination (RFE). The hyperparameters of each target-variable model were fine-tuned using a randomised cross-validated grid search, to improve predictive ability and reduce overfitting.

Findings: The ML models correlate favourably with test experimental data, with R² = 0.69-0.87, and the merits and limitations of the approach are discussed based on 'unseen' hydrocarbon surfactants. The incorporation of a knowledge-based framework provides an appropriate smoothing of the experimental data which simplifies the data-driven approach and enhances its generality. Open-source codes and a brief tutorial are provided. © 2022 The Authors. Published by Elsevier Inc.
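
The fitting step described in the abstract can be illustrated with a minimal sketch. This is not the authors' open-source code. It assumes the common form of the Szyszkowski equation, γ(c) = γ0 - R·T·Γ_max·ln(1 + K_L·c), with the surface tension held constant above the CMC. The function names (`sft_model`, `log_grid`, `fit_profile`), the value γ0 = 72.0 mN/m for water near 25 °C, the units (concentration in mol/m³, Γ_max in mol/m², K_L in m³/mol) and the grid ranges are all assumptions made for illustration, and the demonstration data are synthetic. A simple grid search over K_L and CMC is used here in place of whatever optimiser the paper uses; for each grid point the remaining amplitude R·T·Γ_max is linear and is solved in closed form.

```python
import math

R_GAS = 8.314  # gas constant, J mol^-1 K^-1


def sft_model(c, excess_max, k_l, cmc, gamma0=72.0, temp=298.15):
    """Szyszkowski surface tension (mN/m) at concentration c (mol/m^3).

    Above the CMC the value is held at its CMC level (assumed plateau).
    """
    c_eff = min(c, cmc)
    return gamma0 - 1e3 * R_GAS * temp * excess_max * math.log(1.0 + k_l * c_eff)


def log_grid(lo_exp, hi_exp, n):
    """Return n values evenly spaced in log10 from 10**lo_exp to 10**hi_exp."""
    return [10.0 ** (lo_exp + (hi_exp - lo_exp) * i / (n - 1)) for i in range(n)]


def fit_profile(conc, sft, gamma0=72.0, temp=298.15):
    """Fit (Gamma_max, K_L, CMC) to positive concentrations and SFT values.

    K_L and CMC are scanned on log-spaced grids (assumed ranges); the
    amplitude R*T*Gamma_max follows from linear least squares at each point.
    """
    lo = math.log10(min(conc))
    hi = math.log10(max(conc))
    best = None
    for k_l in log_grid(-3.0, 3.0, 61):
        for cmc in log_grid(lo, hi, 81):
            x = [math.log(1.0 + k_l * min(c, cmc)) for c in conc]
            sxx = sum(v * v for v in x)
            # Least-squares amplitude (mN/m) for gamma0 - sft = amp * x
            amp = sum(v * (gamma0 - y) for v, y in zip(x, sft)) / sxx
            sse = sum((gamma0 - amp * v - y) ** 2 for v, y in zip(x, sft))
            if best is None or sse < best["sse"]:
                best = {
                    "sse": sse,
                    "excess_max": amp / (1e3 * R_GAS * temp),
                    "k_l": k_l,
                    "cmc": cmc,
                }
    return best


# Synthetic, noise-free demonstration data (hypothetical values, not from the paper).
true_params = {"excess_max": 3.0e-6, "k_l": 10.0, "cmc": 8.0}
demo_conc = log_grid(-2.0, 2.0, 25)
demo_sft = [sft_model(c, **true_params) for c in demo_conc]
demo_fit = fit_profile(demo_conc, demo_sft)
print(demo_fit)
```

The three fitted values per surfactant would then serve as the regression targets for the descriptor-based models. The grid resolution limits how precisely K_L and CMC are recovered, so a finer grid or a local optimiser started from the grid optimum may be preferable on real data.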
