4.1 Review

Test accuracy of artificial intelligence-based grading of fundus images in diabetic retinopathy screening: A systematic review

Journal

JOURNAL OF MEDICAL SCREENING
Volume 30, Issue 3, Pages 97-112

Publisher

SAGE PUBLICATIONS LTD
DOI: 10.1177/09691413221144382

Keywords

Artificial intelligence; fundus imaging; sensitivity and specificity; systematic review; diabetic retinopathy; screening

This study systematically reviewed the accuracy of AI-based systems for grading fundus images in DR screening. The results showed that AI-based systems are more sensitive than human graders but vary in specificity. However, the evidence for many systems is limited and may not generalise across settings.
Objectives: To systematically review the accuracy of artificial intelligence (AI)-based systems for grading of fundus images in diabetic retinopathy (DR) screening.
Methods: We searched MEDLINE, EMBASE, the Cochrane Library and ClinicalTrials.gov from 1st January 2000 to 27th August 2021. Accuracy studies published in English were included if they met the pre-specified inclusion criteria. Selection of studies for inclusion, data extraction and quality assessment were conducted by one author, with a second reviewer independently screening and checking 20% of titles. Results were analysed narratively.
Results: Forty-three studies evaluating 15 deep learning (DL) and 4 machine learning (ML) systems were included. Nine systems were evaluated in a single study each. Most studies were judged to be at high or unclear risk of bias in at least one QUADAS-2 domain. Sensitivity for referable DR and higher grades was ≥85%, while specificity varied and was <80% for all ML systems and in 6/31 studies evaluating DL systems. Studies reported high accuracy for detection of ungradable images, but ungradable images were analysed and reported inconsistently. Seven studies reported that AI was more sensitive but less specific than human graders.
Conclusions: AI-based systems are more sensitive than human graders and could be safe to use in clinical practice but have variable specificity. However, for many systems evidence is limited, at high risk of bias and may not generalise across settings. Therefore, pre-implementation assessment in the target clinical pathway is essential to obtain reliable and applicable accuracy estimates.
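
For readers less familiar with the two accuracy measures the abstract reports, the following is a minimal sketch of how sensitivity and specificity are derived from a 2x2 table of index-test results against the reference standard. It is not taken from the review: the `ConfusionCounts` class, the function names, and the counts are all hypothetical, chosen only to illustrate a system that is at least 85% sensitive but under 80% specific.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical 2x2 table: AI grade versus reference-standard grade."""
    tp: int  # referable DR, flagged by the AI system
    fn: int  # referable DR, missed by the AI system
    tn: int  # no referable DR, correctly passed
    fp: int  # no referable DR, flagged anyway


def sensitivity(c: ConfusionCounts) -> float:
    """Proportion of truly referable cases that the system flags."""
    diseased = c.tp + c.fn
    if diseased == 0:
        raise ValueError("no reference-positive cases")
    return c.tp / diseased


def specificity(c: ConfusionCounts) -> float:
    """Proportion of truly non-referable cases that the system passes."""
    healthy = c.tn + c.fp
    if healthy == 0:
        raise ValueError("no reference-negative cases")
    return c.tn / healthy


# Illustrative counts only; they do not come from any included study.
example = ConfusionCounts(tp=90, fn=10, tn=700, fp=200)

if __name__ == "__main__":
    print(f"sensitivity = {sensitivity(example):.3f}")
    print(f"specificity = {specificity(example):.3f}")
```

In a screening pathway, high sensitivity limits missed disease, while low specificity raises the number of false-positive referrals that human graders or clinics must then handle.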
