Article

FDA-approved deep learning software application versus radiologists with different levels of expertise: detection of intracranial hemorrhage in a retrospective single-center study

NEURORADIOLOGY (2022)

Journal

NEURORADIOLOGY

Volume 64, Issue 5, Pages 981-990

Publisher

SPRINGER

DOI: 10.1007/s00234-021-02874-w

Keywords

Artificial intelligence; Deep learning; Intracranial hemorrhage; Computed tomography; Diagnostic accuracy

Categories

Clinical Neurology; Neuroimaging; Radiology, Nuclear Medicine & Medical Imaging

Summary

An FDA-approved and CE-certified deep learning software application was less accurate than a second-year resident in detecting intracranial hemorrhages on non-contrast head CT. The results underline the importance of thoughtful workflow integration and post-approval validation of AI applications in various clinical environments.

Abstract

Purpose: To assess an FDA-approved and CE-certified deep learning (DL) software application compared to the performance of human radiologists in detecting intracranial hemorrhages (ICH).

Methods: Within a 20-week trial from January to May 2020, 2210 adult non-contrast head CT scans were performed in a single center and automatically analyzed by an artificial intelligence (AI) solution with workflow integration. After excluding 22 scans due to severe motion artifacts, images were retrospectively assessed for the presence of ICHs by a second-year resident and a certified radiologist under simulated time pressure. Disagreements were resolved by a subspecialized neuroradiologist serving as the reference standard. We calculated interrater agreement and diagnostic performance parameters, including the Breslow-Day and Cochran-Mantel-Haenszel tests.

Results: An ICH was present in 214 out of 2188 scans. The interrater agreement between the resident and the certified radiologist was very high (kappa = 0.89) and even higher (kappa = 0.93) between the resident and the reference standard. The software delivered 64 false-positive and 68 false-negative results, giving an overall sensitivity, specificity, positive predictive value, negative predictive value, and accuracy of 68.2%, 96.8%, 69.5%, 96.6%, and 94.0%, respectively. Corresponding values for the resident were 94.9%, 99.2%, 93.1%, 99.4%, and 98.8%. The accuracy of the DL application was inferior (p < 0.001) to that of both the resident and the certified neuroradiologist.

Conclusion: A resident under time pressure outperformed an FDA-approved DL program in detecting ICH in CT scans. Our results underline the importance of thoughtful workflow integration and post-approval validation of AI applications in various clinical environments.
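
The software's reported performance figures can be checked from the counts given in the abstract. The following is a minimal Python sketch, not the authors' analysis code. It assumes the standard definitions of the five metrics and derives the true-positive and true-negative counts (146 and 1910) from the stated totals (2188 scans, 214 with ICH, 64 false positives, 68 false negatives). The names `ConfusionMatrix`, `from_reported_counts`, and `diagnostic_metrics` are hypothetical helpers introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """2x2 table of test result versus reference standard."""
    tp: int  # ICH present, flagged by the software
    fp: int  # ICH absent, flagged by the software
    fn: int  # ICH present, missed by the software
    tn: int  # ICH absent, not flagged


def from_reported_counts(total: int, positives: int, fp: int, fn: int) -> ConfusionMatrix:
    """Derive TP and TN from the totals and error counts stated in the abstract."""
    negatives = total - positives
    return ConfusionMatrix(tp=positives - fn, fp=fp, fn=fn, tn=negatives - fp)


def diagnostic_metrics(cm: ConfusionMatrix) -> dict[str, float]:
    """Standard diagnostic accuracy measures, returned as percentages."""
    total = cm.tp + cm.fp + cm.fn + cm.tn
    return {
        "sensitivity": 100 * cm.tp / (cm.tp + cm.fn),
        "specificity": 100 * cm.tn / (cm.tn + cm.fp),
        "ppv": 100 * cm.tp / (cm.tp + cm.fp),
        "npv": 100 * cm.tn / (cm.tn + cm.fn),
        "accuracy": 100 * (cm.tp + cm.tn) / total,
    }


# Counts for the DL software as stated in the abstract.
software = from_reported_counts(total=2188, positives=214, fp=64, fn=68)
for name, value in diagnostic_metrics(software).items():
    print(f"{name}: {value:.1f}%")
```

Rounded to one decimal place, these values match the percentages reported for the software. The high accuracy despite the modest sensitivity reflects the low prevalence of ICH in the sample (214 of 2188 scans), since correctly classified negative scans dominate the total.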
