4.7 Article

Population-wide evaluation of artificial intelligence and radiologist assessment of screening mammograms

Journal

EUROPEAN RADIOLOGY

Publisher

SPRINGER
DOI: 10.1007/s00330-023-10423-7

Keywords

Mammography; Breast cancer; Artificial intelligence; Screening


The objective of this study was to validate the accuracy of a standalone AI system for breast cancer detection on an entire screening population, compared to first-reading breast radiologists.
Objectives: To validate an AI system for standalone breast cancer detection on an entire screening population in comparison to first-reading breast radiologists.

Materials and methods: All mammography screenings performed between August 4, 2014, and August 15, 2018, in the Region of Southern Denmark with follow-up within 24 months were eligible. Screenings were assessed as normal or abnormal by breast radiologists through double reading with arbitration. For an AI decision of normal or abnormal, two AI-score cut-off points were applied by matching at mean sensitivity (AI(sens)) and specificity (AI(spec)) of first readers. Accuracy measures were sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and recall rate (RR).

Results: The sample included 249,402 screenings (149,495 women) and 2033 breast cancers (72.6% screen-detected cancers, 27.4% interval cancers). AI(sens) had lower specificity (97.5% vs 97.7%; p < 0.0001) and PPV (17.5% vs 18.7%; p = 0.01) and a higher RR (3.0% vs 2.8%; p < 0.0001) than first readers. AI(spec) was comparable to first readers in terms of all accuracy measures. Both AI(sens) and AI(spec) detected significantly fewer screen-detected cancers (1166 (AI(sens)), 1156 (AI(spec)) vs 1252; p < 0.0001) but found more interval cancers compared to first readers (126 (AI(sens)), 117 (AI(spec)) vs 39; p < 0.0001) with varying types of cancers detected across multiple subgroups.

Conclusion: Standalone AI can detect breast cancer at an accuracy level equivalent to the standard of first readers when the AI threshold point was matched at first reader specificity. However, AI and first readers detected a different composition of cancers.
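
For readers unfamiliar with the accuracy measures named in the abstract, the following is a minimal Python sketch of how sensitivity, specificity, PPV, NPV, and recall rate follow from confusion-matrix counts, and of one plausible way to choose an AI-score cut-off that matches a target specificity. It is not the authors' implementation: the names `Counts`, `accuracy_measures`, `counts_at_threshold`, and `threshold_matching_specificity` are hypothetical, the rule "score at or above the cut-off means abnormal" is an assumption, and the scores and labels are made-up toy values, not study data.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    tp: int  # cancers assessed as abnormal
    fp: int  # non-cancers assessed as abnormal
    tn: int  # non-cancers assessed as normal
    fn: int  # cancers assessed as normal


def accuracy_measures(c: Counts) -> dict:
    """Standard definitions; assumes every denominator is non-zero."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
        "recall_rate": (c.tp + c.fp) / total,
    }


def counts_at_threshold(scores, labels, threshold) -> Counts:
    """Assumption: a score >= threshold is an 'abnormal' AI decision."""
    tp = fp = tn = fn = 0
    for score, has_cancer in zip(scores, labels):
        abnormal = score >= threshold
        if abnormal and has_cancer:
            tp += 1
        elif abnormal:
            fp += 1
        elif has_cancer:
            fn += 1
        else:
            tn += 1
    return Counts(tp, fp, tn, fn)


def threshold_matching_specificity(scores, labels, target_specificity):
    """Return the lowest observed score whose specificity reaches the target.

    Specificity never decreases as the cut-off rises, so the first cut-off
    that reaches the target also keeps sensitivity as high as possible.
    Returns None if no observed score reaches the target.
    """
    for t in sorted(set(scores)):
        c = counts_at_threshold(scores, labels, t)
        if c.tn / (c.tn + c.fp) >= target_specificity:
            return t
    return None


# Toy data only (hypothetical AI scores; 1 = cancer, 0 = no cancer).
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1, 1]

cutoff = threshold_matching_specificity(scores, labels, target_specificity=0.75)
measures = accuracy_measures(counts_at_threshold(scores, labels, cutoff))
print(cutoff, measures)
```

Matching at a target sensitivity would work the same way, scanning cut-offs from high to low and stopping at the first one whose sensitivity reaches the target.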
