期刊
GECCO'20: PROCEEDINGS OF THE 2020 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE
卷 -, 期 -, 页码 1003-1011出版社
ASSOC COMPUTING MACHINERY
DOI: 10.1145/3377930.3390244
关键词
Genetic Programming; Symbolic Regression; Generalization; Representation; Spline
Genetic Programming for Symbolic Regression is often prone to overfit the training data, resulting in poor generalization on unseen data. To address this issue, many pieces of research have been devoted to regularization via controlling the model complexity. However, due to the unstructured tree based representation of individuals the model complexity cannot be directly computed, rather approximation of the complexity must be taken. This paper proposes a new novel representation called Adaptive Weighted Splines which enables explicit control over the complexity of individuals using splines. The experimental results confirm that this new representation is significantly better than the tree-based representation at avoiding overfitting and generalizing on unseen data, demonstrating notably better and far more consistent generalization performances on all the benchmark problems. Further analysis also shows that in most cases, the new Genetic Programming method outperforms classical regression techniques such as Linear Regression, Support Vector Regression, K-Nearest Neighbour and Decision Tree Regression and performs competitively with state-of-the-art ensemble regression methods Random Forests and Gradient Boosting.
作者
我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。
推荐
暂无数据