Public Covid-19 X-ray datasets and their impact on model bias - A systematic review of a significant problem

Journal

MEDICAL IMAGE ANALYSIS
Volume 74

Publisher

ELSEVIER
DOI: 10.1016/j.media.2021.102225

Keywords

COVID-19; Machine learning; Datasets; X-Ray; Imaging; Review; Bias; Confounding

Funding

  1. Luxembourg National Research Fund (FNR) [COVID-19/2020-1/14702831/AICovIX/Husch]
  2. FNR within the PARK-QC DTU [PRIDE17/12244779/PARK-QC]
  3. Fondation Cancer Luxembourg

The study systematically evaluated publicly available chest X-ray datasets used for computer-aided diagnosis and stratification of COVID-19, highlighting weak bias assessment and limited quality control. Only a small number of datasets met the criteria for a proper risk-of-bias assessment, and most datasets used in peer-reviewed papers carry a high risk of bias.
Computer-aided diagnosis and stratification of COVID-19 based on chest X-ray suffers from weak bias assessment and limited quality control. Undetected bias induced by inappropriate use of datasets and improper consideration of confounders prevents the translation of prediction models into clinical practice. By adapting established tools for model evaluation to the task of evaluating datasets, this study provides a systematic appraisal of publicly available COVID-19 chest X-ray datasets, determining their potential use and evaluating potential sources of bias. Only 9 of more than a hundred identified datasets met the minimum criteria for a proper risk-of-bias assessment and could be analysed in detail. Remarkably, most of the datasets used in 201 papers published in peer-reviewed journals are not among these 9 datasets, which leads to models with a high risk of bias. This raises concerns about the suitability of such models for clinical use. This systematic review highlights the limited description of datasets employed for modelling and helps researchers select the most suitable datasets for their task. (c) 2021 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
