Article

Intelligent modelling of clay compressibility using hybrid meta-heuristic and machine learning algorithms

Journal

GEOSCIENCE FRONTIERS
Volume 12, Issue 1, Pages 441-452

Publisher

CHINA UNIV GEOSCIENCES, BEIJING
DOI: 10.1016/j.gsf.2020.02.014

Keywords

Compressibility; Clays; Machine learning; Optimization; Random forest; Genetic algorithm

Funding

  1. RIF project from the Research Grants Council (RGC) of Hong Kong [PolyU R5037-18F]


This study proposes a novel modelling approach that uses machine learning techniques to predict the compression index C_c for geotechnical design, and shows that machine learning models outperform traditional empirical prediction formulations. Among the tested algorithms, random forest and back-propagation neural network models are recommended for predicting C_c under different conditions.
The compression index C_c is an essential parameter in geotechnical design, yet establishing an effective correlation for it remains a challenge. This paper proposes a novel modelling approach using machine learning (ML) techniques. The performance of five commonly used ML algorithms, i.e. back-propagation neural network (BPNN), extreme learning machine (ELM), support vector machine (SVM), random forest (RF) and evolutionary polynomial regression (EPR), in predicting C_c is comprehensively investigated. A database of 311 data sets with three input variables, i.e. initial void ratio e_0, liquid limit water content w_L and plasticity index I_p, and one output variable, C_c, is first established. A genetic algorithm (GA) is used to optimize the hyper-parameters of the five ML algorithms, with the average prediction error over the 10-fold cross-validation (CV) sets serving as the GA fitness function to enhance the robustness of the ML models. The results indicate that the ML models outperform empirical prediction formulations, giving lower prediction error. RF yields the lowest error, followed by BPNN, ELM, EPR and SVM. If the ranges of the input variables in the database are large enough, the BPNN and RF models are recommended for predicting C_c; if, in addition, the distribution of the input variables is continuous, the RF model is the best choice. If the ranges of the input variables are small, the EPR model is recommended instead. The correlations between input and output variables predicted by the five ML models agree well with the physical explanation.
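
The tuning loop described in the abstract (a GA whose fitness is the average prediction error over 10 cross-validation folds) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the data are synthetic, the learner is a simple distance-weighted k-nearest-neighbour regressor standing in for the five ML algorithms, the error measure is mean absolute error, and the names `make_synthetic_data`, `knn_predict`, `cv_error` and `genetic_search` are hypothetical.

```python
import random
import statistics


def make_synthetic_data(n=100, seed=0):
    """Synthetic (e_0, w_L, I_p) -> C_c records. NOT the paper's database."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        e0 = rng.uniform(0.5, 3.0)        # initial void ratio
        wl = rng.uniform(25.0, 120.0)     # liquid limit (%)
        ip = rng.uniform(5.0, 0.7 * wl)   # plasticity index (%)
        # Made-up linear relation plus noise, for illustration only.
        cc = 0.3 * e0 + 0.004 * wl + 0.002 * ip + rng.gauss(0.0, 0.03)
        rows.append(((e0, wl / 100.0, ip / 100.0), cc))
    return rows


def knn_predict(train, x, k, power):
    """Inverse-distance-weighted k-NN regression (stand-in learner)."""
    nearest = sorted(
        (sum((a - b) ** 2 for a, b in zip(xi, x)) ** 0.5, yi) for xi, yi in train
    )[:k]
    weights = [1.0 / (d + 1e-9) ** power for d, _ in nearest]
    return sum(w * y for w, (_, y) in zip(weights, nearest)) / sum(weights)


def cv_error(data, k, power, n_folds=10):
    """Mean absolute error averaged over the n_folds validation folds."""
    fold_errors = []
    for f in range(n_folds):
        valid = data[f::n_folds]
        train = [row for i, row in enumerate(data) if i % n_folds != f]
        errors = [abs(knn_predict(train, x, k, power) - y) for x, y in valid]
        fold_errors.append(statistics.mean(errors))
    return statistics.mean(fold_errors)


def genetic_search(data, pop_size=10, generations=6, seed=1):
    """Tiny GA over the hyper-parameters (k, power); fitness = CV error."""
    rng = random.Random(seed)
    cache = {}

    def fitness(ind):
        if ind not in cache:
            cache[ind] = cv_error(data, *ind)
        return cache[ind]

    population = [
        (rng.randint(1, 20), round(rng.uniform(0.0, 3.0), 2))
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        population.sort(key=fitness)
        elite = population[:2]  # elitism: carry the two best over unchanged
        children = []
        while len(children) < pop_size - len(elite):
            # Tournament selection of two parents.
            mum = min(rng.sample(population, 3), key=fitness)
            dad = min(rng.sample(population, 3), key=fitness)
            # Uniform crossover: each gene comes from either parent.
            k, power = (rng.choice(pair) for pair in zip(mum, dad))
            # Mutation, clipped to the allowed hyper-parameter ranges.
            if rng.random() < 0.3:
                k = min(20, max(1, k + rng.choice((-2, -1, 1, 2))))
            if rng.random() < 0.3:
                power = round(min(3.0, max(0.0, power + rng.gauss(0.0, 0.4))), 2)
            children.append((k, power))
        population = elite + children
    best = min(population, key=fitness)
    return best, fitness(best)


if __name__ == "__main__":
    (best_k, best_power), err = genetic_search(make_synthetic_data())
    print(f"best k={best_k}, power={best_power}, 10-fold CV MAE={err:.4f}")
```

Scoring each candidate by its error averaged over all folds, rather than on a single train/test split, is what ties the selected hyper-parameters to robustness in the sense the abstract describes. Swapping in another learner would only require replacing `knn_predict` and the hyper-parameter ranges.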
