4.7 Article

An empirical overview of nonlinearity and overfitting in machine learning using COVID-19 data

Journal

CHAOS SOLITONS & FRACTALS
Volume 139

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.chaos.2020.110055

Keywords

Bias-variance dilemma; Time series prediction; Support vector machine; Statistical learning; Hyperparameters and chaos; Epidemic spreading

In this paper, we applied support vector regression to predict the number of COVID-19 cases for the 12 most-affected countries, testing different structures of nonlinearity via kernel functions and analyzing the sensitivity of the models' predictive performance to different hyperparameter settings using 3-D interpolated surfaces. In our experiment, the model that incorporates the highest degree of nonlinearity (Gaussian kernel) had the best in-sample performance but also yielded the worst out-of-sample predictions, a typical example of overfitting in a machine learning model. The linear kernel function, on the other hand, performed poorly in-sample but generated the best out-of-sample forecasts. The findings of this paper provide an empirical assessment of fundamental concepts in data analysis and demonstrate the need for caution when applying machine learning models to support real-world decision making, notably with respect to the challenges arising from the COVID-19 pandemic. (C) 2020 Elsevier Ltd. All rights reserved.
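
The comparison described in the abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' code or data: it assumes scikit-learn's SVR, a synthetic logistic-shaped series standing in for cumulative case counts, a hold-out of the final days as the out-of-sample set, and arbitrarily chosen hyperparameters (C, epsilon, gamma). The function name compare_kernels and all numeric settings are hypothetical.

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.svm import SVR


def compare_kernels(n_days=60, n_test=10, seed=0):
    """Fit SVR with a linear and a Gaussian (RBF) kernel on a synthetic
    series and report in-sample and out-of-sample RMSE for each."""
    rng = np.random.default_rng(seed)

    # ASSUMPTION: synthetic data, not real COVID-19 counts.
    # A noisy logistic curve loosely mimics cumulative case growth.
    days = np.arange(n_days, dtype=float)
    cases = 1000.0 / (1.0 + np.exp(-(days - 30.0) / 6.0))
    cases = cases + rng.normal(0.0, 10.0, n_days)

    # Rescale inputs and targets so one set of hyperparameters is usable.
    X = (days / n_days).reshape(-1, 1)
    y = cases / cases.max()

    # Time-ordered split: the last n_test days are held out.
    X_train, X_test = X[:-n_test], X[-n_test:]
    y_train, y_test = y[:-n_test], y[-n_test:]

    results = {}
    for kernel in ("linear", "rbf"):
        # ASSUMPTION: illustrative hyperparameters, not tuned values.
        model = SVR(kernel=kernel, C=10.0, epsilon=0.01, gamma="scale")
        model.fit(X_train, y_train)
        in_mse = mean_squared_error(y_train, model.predict(X_train))
        out_mse = mean_squared_error(y_test, model.predict(X_test))
        results[kernel] = {
            "in_sample_rmse": float(np.sqrt(in_mse)),
            "out_of_sample_rmse": float(np.sqrt(out_mse)),
        }
    return results


if __name__ == "__main__":
    for kernel, scores in compare_kernels().items():
        print(kernel, scores)
```

Comparing the two RMSE values per kernel is the point of the exercise: a kernel whose in-sample error is much lower than its out-of-sample error is showing the overfitting pattern the abstract describes. Whether this toy series reproduces the paper's ranking of kernels depends on the synthetic data and the hyperparameters chosen here.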
