4.7 Article

Self-paced ensemble for constructing an efficient robust high-performance classification model for detecting mineralization anomalies from geochemical exploration data

Journal

ORE GEOLOGY REVIEWS
Volume 157

Publisher

ELSEVIER
DOI: 10.1016/j.oregeorev.2023.105418

Keywords

Self-paced ensemble; Self-training; Classification; Precision-recall curve; Area under the precision-recall curve; Mineralization anomaly

The self-paced ensemble algorithm is a more efficient and robust approach for detecting mineralization anomalies in geochemical exploration data compared to the self-training algorithm. Through a case study in Inner Mongolia, it is shown that the self-paced ensemble algorithm outperforms the self-training algorithm in terms of classification performance, robustness, and efficiency.
Given a base classifier such as the support vector classifier, the self-training algorithm can be used to build a high-performance classification model for detecting mineralization anomalies from geochemical exploration data. However, the resulting classification model has poor robustness. To solve this problem, the self-paced ensemble algorithm was adopted to establish an efficient, robust, high-performance model for detecting mineralization geochemical anomalies. The self-paced ensemble algorithm can efficiently build a robust classification model from a base classifier such as the support vector classifier, decision tree classifier, k-nearest neighbor classifier, gradient boosting classifier, or multilayer perceptron. To illustrate the superiority of the self-paced ensemble algorithm, a case study of molybdenum mineralization anomaly detection was carried out in the Molidawa area, Inner Mongolia, China. The self-paced ensemble algorithm and the self-training algorithm were each used to build classification models based on a decision tree classifier to detect molybdenum mineralization anomalies from stream sediment survey data. Each algorithm was run five times with the same set of initialization parameters, yielding five classification models per algorithm. The precision-recall curve (PRC) and the area under the precision-recall curve (AUPRC) were used to evaluate the performance of the classification models in geochemical exploration. The results show that, compared with the PRCs of the five classification models established by the self-training algorithm, the PRCs of those established by the self-paced ensemble algorithm coincide with each other and are closer to the upper right corner of the precision-recall space.
The AUPRCs of the five classification models established by the self-paced ensemble algorithm are all 0.3538, whereas the AUPRCs of those established by the self-training algorithm range from 0.006624 to 0.02941, far below 0.3538. In addition, the data-modeling time of the self-paced ensemble algorithm ranges from 44.84 to 47.05 s, while that of the self-training algorithm ranges from 51.97 to 59.46 s. Therefore, in detecting molybdenum mineralization anomalies, the model established by the self-paced ensemble algorithm is better than the model established by the self-training algorithm in data-modeling efficiency, robustness, and classification performance. It can be concluded that the self-paced ensemble algorithm is an effective tool for establishing an efficient, robust, high-performance classification model for mineralization anomaly detection.
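
The evaluation described above rests on two quantities: the AUPRC of each model and how much that value varies across repeated runs. The following is a minimal sketch of how they could be computed, not the authors' code. It assumes binary labels (1 = mineralized, 0 = background) and uses the step-wise average-precision estimate of the area, which may differ from the integration rule used in the paper. The function names `precision_recall_points`, `auprc`, and `spread` are hypothetical, and the toy labels and scores are illustrative only.

```python
from typing import List, Sequence, Tuple


def precision_recall_points(
    labels: Sequence[int], scores: Sequence[float]
) -> List[Tuple[float, float]]:
    """Return (recall, precision) pairs, one per distinct score threshold."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("at least one positive sample is required")
    # Rank samples from the highest to the lowest anomaly score.
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    points = []
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last sample of a group of tied scores.
        is_last = i == len(ranked) - 1
        if is_last or ranked[i + 1][0] != score:
            points.append((tp / n_pos, tp / (tp + fp)))
    return points


def auprc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise area under the PR curve (the average-precision estimate)."""
    area = 0.0
    prev_recall = 0.0
    for recall, precision in precision_recall_points(labels, scores):
        area += (recall - prev_recall) * precision
        prev_recall = recall
    return area


def spread(values: Sequence[float]) -> float:
    """Max minus min: a simple indicator of run-to-run variation."""
    return max(values) - min(values)


if __name__ == "__main__":
    # Toy data (assumed, not from the study): 2 positives among 8 samples.
    toy_labels = [1, 0, 1, 0, 0, 0, 0, 0]
    toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
    print("toy AUPRC:", auprc(toy_labels, toy_scores))
    # Only the extreme AUPRC values of the self-training runs are reported.
    print("self-training AUPRC spread:", spread([0.006624, 0.02941]))
```

Because geochemical anomaly detection is a heavily imbalanced problem, a model whose PRC lies close to the upper right corner yields an AUPRC near 1, while a model that ranks samples almost at random yields an AUPRC near the proportion of positive samples. A spread of zero across repeats, as reported for the self-paced ensemble models, indicates that every run produced the same AUPRC.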
