4.2 Article

Performance of a Chest Radiograph AI Diagnostic Tool for COVID-19: A Prospective Observational Study

Journal

RADIOLOGY-ARTIFICIAL INTELLIGENCE
Volume 4, Issue 4, Pages -

Publisher

RADIOLOGICAL SOC NORTH AMERICA (RSNA)
DOI: 10.1148/ryai.210217

Keywords

Diagnosis; Classification; Application Domain; Infection; Lung

Funding

  1. Agency for Healthcare Research and Quality (AHRQ) [K12HS026379]
  2. Patient-Centered Outcomes Research Institute (PCORI) [K12HS026379]
  3. National Institutes of Health (NIH) National Center for Advancing Translational Sciences [KL2TR002492, UL1TR002494]
  4. NIH National Heart, Lung, and Blood Institute [T32HL07741]
  5. NIH National Institute of Biomedical Imaging and Bioengineering [75N92020D00018/75N92020F00001]
  6. National Institute of Biomedical Imaging and Bioengineering MIDRC grant of the National Institutes of Health [75N92020C00008, 75N92020C00021]
  7. U.S. National Science Foundation from the Division of Electrical, Communication and Cyber Systems [1928481]
  8. University of Minnesota Office of the Vice President of Research (OVPR) COVID-19 Impact Grant

Abstract

A prospective observational study was conducted across 12 U.S. hospitals to evaluate the real-time performance of an interpretable AI model in detecting COVID-19 on chest radiographs. The study found that the model's diagnostic accuracy (63.5% correct) remained below that of board-certified radiologists (67.8% and 68.6% correct).
Purpose: To conduct a prospective observational study across 12 U.S. hospitals to evaluate the real-time performance of an interpretable artificial intelligence (AI) model in detecting COVID-19 on chest radiographs.

Materials and Methods: A total of 95,363 chest radiographs were included in model training, external validation, and real-time validation. The model was deployed as a clinical decision support system, and its performance was prospectively evaluated. There were 5,335 total real-time predictions, with a COVID-19 prevalence of 4.8% (258 of 5,335). Model performance was assessed with receiver operating characteristic (ROC) analysis, precision-recall curves, and the F1 score. Logistic regression was used to evaluate the association of race and sex with AI model diagnostic accuracy. To compare model accuracy with the performance of board-certified radiologists, a third dataset of 1,638 images was read independently by two radiologists.

Results: Participants positive for COVID-19 had higher COVID-19 diagnostic scores than participants negative for COVID-19 (median, 0.1 [IQR, 0.0-0.8] vs 0.0 [IQR, 0.0-0.1], respectively; P < .001). Real-time model performance was unchanged over 19 weeks of implementation (area under the ROC curve, 0.70; 95% CI: 0.66, 0.73). Model sensitivity was higher in men than in women (P = .01), whereas model specificity was higher in women (P = .001). Sensitivity was higher for Asian (P = .002) and Black (P = .046) participants than for White participants. The COVID-19 AI diagnostic system had lower accuracy (63.5% correct) than radiologist predictions (radiologist 1: 67.8% correct; radiologist 2: 68.6% correct; McNemar P < .001 for both).

Conclusion: AI-based tools have not yet reached full diagnostic potential for COVID-19 and underperform compared with radiologist prediction.

(C) RSNA, 2022
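The evaluation workflow described in the abstract (ROC analysis, precision-recall curves, F1 score, and a paired McNemar comparison against radiologist reads) can be illustrated with a short Python sketch. This is not the authors' code: the labels, AI scores, and radiologist predictions below are synthetic placeholders, and the 0.5 decision threshold is an arbitrary assumption made only for illustration.

```python
# Minimal sketch (not the authors' implementation) of the metrics named in the
# abstract: ROC AUC, precision-recall curve, F1 score, and McNemar's test.
# All data here are synthetic placeholders.
import numpy as np
from sklearn.metrics import roc_auc_score, precision_recall_curve, f1_score
from statsmodels.stats.contingency_tables import mcnemar

rng = np.random.default_rng(0)

# Hypothetical ground truth (0 = negative, 1 = COVID-19 positive) and AI diagnostic scores.
y_true = rng.integers(0, 2, size=500)
ai_score = np.clip(y_true * 0.3 + rng.normal(0.2, 0.25, size=500), 0, 1)

# Receiver operating characteristic analysis.
auc = roc_auc_score(y_true, ai_score)

# Precision-recall curve and F1 score at an illustrative 0.5 threshold.
precision, recall, thresholds = precision_recall_curve(y_true, ai_score)
ai_pred = (ai_score >= 0.5).astype(int)
f1 = f1_score(y_true, ai_pred)

# McNemar's test comparing AI correctness against a (simulated) radiologist
# read on the same cases, i.e., a paired comparison of diagnostic accuracy.
radiologist_pred = rng.integers(0, 2, size=500)
ai_correct = ai_pred == y_true
rad_correct = radiologist_pred == y_true
table = np.array([
    [np.sum(ai_correct & rad_correct), np.sum(ai_correct & ~rad_correct)],
    [np.sum(~ai_correct & rad_correct), np.sum(~ai_correct & ~rad_correct)],
])
result = mcnemar(table, exact=False, correction=True)

print(f"AUC={auc:.2f}, F1={f1:.2f}, McNemar p={result.pvalue:.3f}")
```

In the study itself, the paired comparison was made between the AI system and each of two radiologists on the 1,638-image reader dataset; the sketch mirrors that paired structure on toy data.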
