Journal
APPLIED SOFT COMPUTING
Volume 54, Issue -, Pages 23-45
Publisher
ELSEVIER
DOI: 10.1016/j.asoc.2017.01.011
Keywords
Multiple comparison; Friedman test; Nemenyi test; CRS4EAs
When multiple algorithms are compared on multiple optimisation problems, the number of algorithms, the number of problems and even the number of independent runs can be expected to affect the final conclusions. Our research question was to what extent these three factors affect the conclusions of standard Null Hypothesis Significance Testing (NHST) and the conclusions of our novel method for comparison and ranking, the Chess Rating System for Evolutionary Algorithms (CRS4EAs). An extensive experiment was conducted, and the results of k = 16 algorithms on N = 40 optimisation problems over n = 100 runs were gathered and saved. These results were then analysed to show how the three values affect the final results, how they affect the ranking, and which values provide unreliable results. The influence of the number of algorithms was examined for k ∈ {4, 8, 12, 16}, the number of problems for N ∈ {5, 10, 20, 40}, and the number of independent runs for n ∈ {10, 30, 50, 100}. We were also interested in comparing the two methods (NHST's Friedman test with the post-hoc Nemenyi test, and CRS4EAs) to see whether one has advantages over the other. Whilst the conclusions for the different values of k were very similar, this research showed that an unsuitable value of N can give unreliable results when analysing with the Friedman test. For small values of N, the Friedman test detects no significant differences or only a small number of them, whereas CRS4EAs does not have this problem. We have also shown that CRS4EAs is an appropriate method when only a small number of independent runs n is available. (C) 2017 Elsevier B.V. All rights reserved.
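The sensitivity of the Friedman/Nemenyi procedure to the number of problems N can be illustrated with a minimal sketch of the textbook formulation of both tests. This is not the authors' code and it does not implement CRS4EAs. The function names, the toy score table, and the critical value q_alpha = 2.569 (the commonly tabulated Nemenyi value for k = 4 at significance level 0.05) are assumptions made for illustration only.

```python
import math


def average_ranks(scores):
    """Mean rank of each algorithm over all problems.

    scores[i][j] is the error of algorithm j on problem i (lower is better).
    Tied values within a problem share the average of their ranks.
    """
    n_problems = len(scores)
    k = len(scores[0])
    totals = [0.0] * k
    for row in scores:
        order = sorted(range(k), key=lambda j: row[j])
        pos = 0
        while pos < k:
            end = pos
            # Extend the group while the next value ties with the current one.
            while end + 1 < k and row[order[end + 1]] == row[order[pos]]:
                end += 1
            shared = (pos + end) / 2 + 1  # average of 1-based ranks pos+1..end+1
            for j in order[pos:end + 1]:
                totals[j] += shared
            pos = end + 1
    return [t / n_problems for t in totals]


def friedman_statistic(avg_ranks, n_problems):
    """Friedman chi-square statistic computed from the average ranks."""
    k = len(avg_ranks)
    sum_sq = sum(r * r for r in avg_ranks)
    return 12 * n_problems / (k * (k + 1)) * (sum_sq - k * (k + 1) ** 2 / 4)


def nemenyi_critical_difference(k, n_problems, q_alpha):
    """Smallest difference in average rank that the Nemenyi test calls significant."""
    return q_alpha * math.sqrt(k * (k + 1) / (6 * n_problems))


# Hypothetical toy data: 3 problems, 3 algorithms, identical ordering on each.
toy_scores = [[0.1, 0.2, 0.3], [1.0, 2.0, 3.0], [5.0, 6.0, 7.0]]
toy_ranks = average_ranks(toy_scores)
print(toy_ranks, friedman_statistic(toy_ranks, len(toy_scores)))

# Critical difference for k = 4 algorithms at the smallest and largest N studied.
Q_ALPHA_K4 = 2.569  # assumed tabulated value for k = 4, alpha = 0.05
for n_problems in (5, 40):
    print(n_problems, nemenyi_critical_difference(4, n_problems, Q_ALPHA_K4))
```

Under these assumptions the critical difference for k = 4 is roughly 2.10 at N = 5 and roughly 0.74 at N = 40. Since average ranks for four algorithms lie between 1 and 4, a threshold above 2 leaves very little room for any pair to differ significantly, which is consistent with the observation that few or no differences are detected for small N.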