4.6 Article

Kaggle forecasting competitions: An overlooked learning opportunity

Journal

INTERNATIONAL JOURNAL OF FORECASTING
Volume 37, Issue 2, Pages 587-603

Publisher

ELSEVIER
DOI: 10.1016/j.ijforecast.2020.07.007

Keywords

Time series methods; M competitions; Business forecasting; Forecast accuracy; Machine learning methods; Benchmarking; Time series visualization; Forecasting competition review

Funding

  1. Manufacturing Academy of Denmark (MADE) Digital [6151-00006B]

Abstract

This study reviews the results of six forecasting competitions hosted on the Kaggle platform, highlighting the strong performance of ensemble models, gradient boosted decision trees, and neural networks. It also examines how external information, validation strategies, and data characteristics affect the choice between statistical and machine learning methods.
We review the results of six forecasting competitions based on the online data science platform Kaggle, which have been largely overlooked by the forecasting community. In contrast to the M competitions, the competitions reviewed in this study feature daily and weekly time series with exogenous variables, business hierarchy information, or both. Furthermore, the Kaggle data sets all exhibit higher entropy than the M3 and M4 competitions, and they are intermittent. In this review, we confirm the conclusion of the M4 competition that ensemble models using cross-learning tend to outperform local time series models and that gradient boosted decision trees and neural networks are strong forecast methods. Moreover, we present insights regarding the use of external information and validation strategies, and discuss the impacts of data characteristics on the choice of statistical or machine learning methods. Based on these insights, we construct nine ex-ante hypotheses for the outcome of the M5 competition to allow empirical validation of our findings. © 2020 International Institute of Forecasters. Published by Elsevier B.V. All rights reserved.
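
The central finding highlighted in the abstract is that a single "cross-learned" model trained across many series, such as a gradient boosted decision tree, tends to outperform per-series local models. The sketch below is not taken from the paper; the data, feature choices, and parameters are illustrative assumptions. It shows what a minimal global GBDT setup looks like with scikit-learn's HistGradientBoostingRegressor: lag features are built per series, pooled into one training table together with a series identifier, and a single model is fit and evaluated on a temporal holdout.

    import numpy as np
    import pandas as pd
    from sklearn.ensemble import HistGradientBoostingRegressor

    rng = np.random.default_rng(0)

    # Toy panel: 50 daily series of length 200, standing in for a Kaggle-style data set.
    frames = []
    for sid in range(50):
        y = np.cumsum(rng.normal(size=200)) + 10 * np.sin(2 * np.pi * np.arange(200) / 7)
        frames.append(pd.DataFrame({"series_id": sid, "t": np.arange(200), "y": y}))
    panel = pd.concat(frames, ignore_index=True)

    # Cross-learning: lag features are computed per series but pooled into one table,
    # so one global model is trained across all series instead of one model per series.
    for lag in (1, 7, 14):
        panel[f"lag_{lag}"] = panel.groupby("series_id")["y"].shift(lag)
    panel = panel.dropna()

    train = panel[panel["t"] < 180]   # temporal holdout: last 20 steps of every series
    test = panel[panel["t"] >= 180]

    features = ["series_id", "lag_1", "lag_7", "lag_14"]
    global_model = HistGradientBoostingRegressor(max_iter=300, random_state=0)
    global_model.fit(train[features], train["y"])

    mae = np.mean(np.abs(global_model.predict(test[features]) - test["y"]))
    print(f"Global (cross-learned) GBDT one-step-ahead MAE: {mae:.3f}")

A local baseline would instead fit a separate model (e.g., an exponential smoothing or ARIMA variant) to each series; the pooled design above is what lets the model share seasonal and level patterns across series, which is the behaviour the review credits for the strong Kaggle results.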
