Article

Statistical approach for automated weighting of datasets: Application to heat capacity data

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.calphad.2020.101994

Keywords

Weighting; K-fold Cross-Validation; Heat capacity; CALPHAD

Funding

  1. German Research Foundation (DFG) [SFB TR-103]
  2. NASA Space Technology Research Fellowship, USA [80NSSC18K116]
  3. IMPRS-SurMat, Germany

An essential step in CALPHAD is assigning relative weights to different datasets, but there is no consensus on the best approach. Currently, weights for experimental or first-principles data are assigned manually, based on the knowledge and experience of the modeler. Since this manual treatment is subjective and time-consuming, the handling of such data is rapidly advancing toward automated procedures through statistical and data mining tools. In the present study, we propose an automated approach to determine the weight of datasets based on the K-Fold Cross-Validation method, modified under the conditions that each fold is selected non-randomly and contains an unequal number of observations. This approach can serve researchers as a support tool to evaluate the reliability of each dataset involved in CALPHAD modeling and to quantify the impact of weighting through statistical analysis of the corresponding model. We demonstrate the efficacy of this method through the evaluation of heat capacity data of fcc nickel, hcp magnesium, and bcc iron.
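
The modified cross-validation idea described above, in which each fold is one whole dataset rather than a random split, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the quadratic heat capacity model, the synthetic data, and in particular the rule that turns held-out errors into weights (normalized inverse mean squared error). The function names are hypothetical.

```python
import numpy as np

def design_matrix(T):
    """Quadratic basis 1, T, T**2 (an illustrative stand-in for a Cp model)."""
    T = np.asarray(T, dtype=float)
    return np.column_stack([np.ones_like(T), T, T**2])

def leave_one_dataset_out_errors(datasets):
    """Treat each dataset as one fold: fit on all other datasets and
    return the mean squared prediction error on the held-out one."""
    errors = []
    for k, (T_test, cp_test) in enumerate(datasets):
        T_train = np.concatenate([T for i, (T, _) in enumerate(datasets) if i != k])
        cp_train = np.concatenate([cp for i, (_, cp) in enumerate(datasets) if i != k])
        coef, *_ = np.linalg.lstsq(design_matrix(T_train), cp_train, rcond=None)
        residual = design_matrix(T_test) @ coef - cp_test
        errors.append(float(np.mean(residual**2)))
    return np.array(errors)

def weights_from_errors(errors):
    """ASSUMED rule: weight is proportional to 1 / held-out MSE, normalized to sum to 1."""
    inverse = 1.0 / np.asarray(errors, dtype=float)
    return inverse / inverse.sum()

# Synthetic data, invented for this sketch (not measurements from the paper).
rng = np.random.default_rng(0)

def true_cp(T):
    return 20.0 + 0.02 * T - 5e-6 * T**2

def make_dataset(n, noise, bias=0.0):
    T = np.sort(rng.uniform(300.0, 1500.0, n))
    return T, true_cp(T) + bias + rng.normal(0.0, noise, n)

# Folds are non-random (one per source) and of unequal size.
datasets = [
    make_dataset(40, 0.2),            # large, consistent dataset
    make_dataset(15, 0.3),            # small, consistent dataset
    make_dataset(25, 0.2, bias=3.0),  # systematically offset dataset
]

errors = leave_one_dataset_out_errors(datasets)
weights = weights_from_errors(errors)
for k, (e, w) in enumerate(zip(errors, weights)):
    print(f"dataset {k}: held-out MSE = {e:.3f}, weight = {w:.3f}")
```

In this sketch the systematically offset dataset is predicted worst by a model fitted to the other two, so it receives the smallest weight. Any other monotone mapping from held-out error to weight could be substituted for the assumed rule.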
