4.6 Article

Application of a data-driven XGBoost model for the prediction of COVID-19 in the USA: a time-series study

Journal

BMJ OPEN
Volume 12, Issue 7

Publisher

BMJ PUBLISHING GROUP
DOI: 10.1136/bmjopen-2021-056685

Keywords

COVID-19; epidemiology

Funding

  1. National Natural Science Foundation of China [81202254]
  2. Health and Medical Big Data Research Project of China Medical University [HMB201903105]
  3. Science Foundation of Liaoning Provincial Department of Education [LJKQZ2021027]

This study compared the accuracy of the ARIMA model and the XGBoost model in predicting the occurrence of COVID-19 in the USA. The results showed that the XGBoost model had lower MAE, RMSE, and MAPE values than the ARIMA model.
Objective: The COVID-19 outbreak was first reported in Wuhan, China, and has been acknowledged as a pandemic due to its rapid spread worldwide. Predicting the trend of COVID-19 is of great significance for its prevention. A comparison between the autoregressive integrated moving average (ARIMA) model and the eXtreme Gradient Boosting (XGBoost) model was conducted to determine which was more accurate for anticipating the occurrence of COVID-19 in the USA.
Design: Time-series study.
Setting: The USA was the setting for this study.
Main outcome measures: Three accuracy metrics, mean absolute error (MAE), root mean square error (RMSE) and mean absolute percentage error (MAPE), were applied to evaluate the performance of the two models.
Results: In our study, for the training set and the validation set, the MAE, RMSE and MAPE of the XGBoost model were less than those of the ARIMA model.
Conclusions: The XGBoost model can help improve prediction of COVID-19 cases in the USA over the ARIMA model.
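
For readers unfamiliar with the three accuracy metrics named in the abstract, the following is a minimal sketch of their standard textbook definitions. It is not the authors' code. The function names and the sample numbers are illustrative assumptions and are not data from the study. For all three metrics, lower values indicate a closer fit.

```python
import math

def mae(actual, predicted):
    """Mean absolute error: average size of the forecast errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean square error: penalises large errors more heavily than MAE."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )

def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Undefined when any actual value is zero.
    """
    return 100.0 * sum(
        abs((a - p) / a) for a, p in zip(actual, predicted)
    ) / len(actual)

# Made-up daily case counts and forecasts, for illustration only.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 380.0]

print(mae(actual, predicted))
print(rmse(actual, predicted))
print(mape(actual, predicted))
```

Comparing two models on these metrics means computing each metric once per model on the same held-out observations and preferring the model with the smaller values.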
