Article

Detecting Spurious Correlations With Sanity Tests for Artificial Intelligence Guided Radiology Systems

Journal

FRONTIERS IN DIGITAL HEALTH
Volume 3

Publisher

FRONTIERS MEDIA SA
DOI: 10.3389/fdgth.2021.671015

Keywords

deep learning; computed tomography; bias; validation; spurious correlations; artificial intelligence

Funding

  1. National Institutes of Health/National Cancer Institute Cancer Center [P30 CA008748]
  2. NSF [1909696]

Artificial intelligence has been successful in solving problems in machine perception. In radiology, AI systems are rapidly evolving and show progress in guiding treatment decisions, diagnosing and localizing disease on medical images, and improving radiologists' efficiency. Analytical and clinical validation studies are critical components of deploying AI in radiology.
Artificial intelligence (AI) has been successful at solving numerous problems in machine perception. In radiology, AI systems are rapidly evolving and show progress in guiding treatment decisions, diagnosing, localizing disease on medical images, and improving radiologists' efficiency. A critical component to deploying AI in radiology is to gain confidence in a developed system's efficacy and safety. The current gold standard approach is to conduct an analytical validation of performance on a generalization dataset from one or more institutions, followed by a clinical validation study of the system's efficacy during deployment. Clinical validation studies are time-consuming, and best practices dictate limited re-use of analytical validation data, so it is ideal to know ahead of time if a system is likely to fail analytical or clinical validation. In this paper, we describe a series of sanity tests to identify when a system performs well on development data for the wrong reasons. We illustrate the sanity tests' value by designing a deep learning system to classify pancreatic cancer seen in computed tomography scans.
