Article

Learning to predict soccer results from relational data with gradient boosted trees

Journal

MACHINE LEARNING
Volume 108, Issue 1, Pages 29-47

Publisher

SPRINGER
DOI: 10.1007/s10994-018-5704-6

Keywords

Prediction challenge; Relational data; Soccer; Gradient boosted trees; Relational dependency networks; Sports; Forecasting

Funding

  1. Czech Science Foundation [17-26999S]
  2. CESNET [LM2015042]
  3. CERIT Scientific Cloud [LM2015085]

We describe our winning solution to the 2017 Soccer Prediction Challenge, organized in conjunction with the MLJ's special issue on Machine Learning for Soccer. The goal of the challenge was to predict the outcomes of future matches within a selected time frame from different leagues around the world. A dataset of over 200,000 past match outcomes was provided to the contestants. We experimented with both relational and feature-based methods to learn predictive models from the provided data. We employed relevant latent variables computable from the data, namely the so-called pi-ratings and a rating based on the PageRank method. A method based on manually constructed features and the gradient boosted tree algorithm performed best on both the validation set and the challenge test set. We also discuss the validity of the assumption that probability predictions on the three ordinal match outcomes should be monotone, an assumption that underlies the RPS (ranked probability score) measure of prediction quality.
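
For readers unfamiliar with the RPS measure mentioned above, the following is a minimal sketch of the standard ranked probability score for a single match. It is not taken from the paper: the function name `rps` and the outcome ordering (home win, draw, away win) are assumptions made here for illustration.

```python
from itertools import accumulate


def rps(probs, outcome_index):
    """Ranked probability score for one match; lower is better.

    probs: predicted probabilities over the ordered outcomes
           (assumed order: home win, draw, away win).
    outcome_index: index of the outcome that actually occurred.
    """
    observed = [1.0 if i == outcome_index else 0.0 for i in range(len(probs))]
    cum_pred = list(accumulate(probs))
    cum_obs = list(accumulate(observed))
    # The last cumulative difference is always 1 - 1 = 0, so it is skipped.
    squared = [(p - o) ** 2 for p, o in zip(cum_pred[:-1], cum_obs[:-1])]
    return sum(squared) / (len(probs) - 1)
```

Because the score is computed on cumulative probabilities, it is sensitive to the ordering of outcomes. If the home team wins, `rps([0.6, 0.3, 0.1], 0)` is about 0.085 while `rps([0.6, 0.1, 0.3], 0)` is about 0.125, even though both forecasts give the true outcome the same probability. This ordinal treatment is the assumption the abstract calls into question.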
