Article

Development and Validation of a Deep Learning System to Detect Glaucomatous Optic Neuropathy Using Fundus Photographs

Journal

JAMA OPHTHALMOLOGY
Volume 137, Issue 12, Pages 1353-1360

Publisher

AMER MEDICAL ASSOC
DOI: 10.1001/jamaophthalmol.2019.3501

Funding

  1. National Natural Science Fund Projects of China [81271005]
  2. Beijing Municipal Administration of Hospitals Qingmiao Projects [QMS20180210]
  3. Priming Scientific Research Foundation for the Junior Researcher in Beijing Tongren Hospital [2016-YJJ-ZZL-021]
  4. Beijing Tongren Hospital Top Talent Training Program
  5. Medical Synergy Science and Technology Innovation Research [Z181100001918035]

Abstract

Question: How does a deep learning system compare with professional human graders in detecting glaucomatous optic neuropathy?

Findings: In this cross-sectional study, the deep learning system showed a sensitivity and specificity of greater than 90% for detecting glaucomatous optic neuropathy in a local validation data set, in 3 clinical-based data sets, and in a real-world distribution data set. The deep learning system showed lower sensitivity when tested in multiethnic and website-based data sets.

Meaning: This assessment of fundus images suggests that deep learning systems can provide a tool with high sensitivity and specificity that might expedite screening for glaucomatous optic neuropathy.

Importance: A deep learning system (DLS) that could automatically detect glaucomatous optic neuropathy (GON) with high sensitivity and specificity could expedite screening for GON.

Objective: To establish a DLS for detection of GON using retinal fundus images and glaucoma diagnosis with convoluted neural networks (GD-CNN) that has the ability to be generalized across populations.

Design, Setting, and Participants: In this cross-sectional study, a DLS for automated classification of GON was developed using retinal fundus images obtained from the Chinese Glaucoma Study Alliance (CGSA), the Handan Eye Study, and online databases. A total of 241 032 images were selected as the training data set. The images were entered into the databases on June 9, 2009, and obtained on July 11, 2018; analyses were performed on December 15, 2018.
The generalization of the DLS was tested in several validation data sets, which allowed assessment of the DLS in a clinical setting without exclusions, testing against variable image quality based on fundus photographs obtained from websites, evaluation in a population-based study that reflects a natural distribution of patients with glaucoma within the cohort, and evaluation in an additive data set that has a diverse ethnic distribution. An online learning system was established to transfer the trained and validated DLS to generalize the results with fundus images from new sources. To better understand the DLS decision-making process, a prediction visualization test was performed that identified regions of the fundus images utilized by the DLS for diagnosis.

Exposures: Use of a deep learning system.

Main Outcomes and Measures: Area under the receiver operating characteristic curve (AUC), sensitivity, and specificity of the DLS with reference to professional graders.

Results: From a total of 274 413 fundus images initially obtained from CGSA, 269 601 images passed initial image quality review and were graded for GON. A total of 241 032 images (definite GON 29 865 [12.4%], probable GON 11 046 [4.6%], unlikely GON 200 121 [83.0%]) from 68 013 patients were selected using random sampling to train the GD-CNN model. The GD-CNN model was validated and evaluated using the remaining 28 569 images from CGSA. The AUC of the GD-CNN model in primary local validation data sets was 0.996 (95% CI, 0.995-0.998), with sensitivity of 96.2% and specificity of 97.7%. The most common reason for both false-negative and false-positive grading by GD-CNN (51 of 119 [46.3%] and 191 of 588 [32.3%]) and manual grading (50 of 113 [44.2%] and 183 of 538 [34.0%]) was pathologic or high myopia.

Conclusions and Relevance: Application of GD-CNN to fundus images from different settings and of varying image quality demonstrated high sensitivity, specificity, and generalizability for detecting GON.
These findings suggest that an automated DLS could enhance current screening programs in a cost-effective and time-efficient manner.

This cross-sectional study compares the sensitivity and specificity of automated classification of glaucomatous optic neuropathy on retinal fundus images by a deep learning system with classification by human experts, using Chinese, multiethnic, and website-based data sets.
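
The outcome measures named in the abstract (sensitivity, specificity, and AUC) can be illustrated with a short sketch. This is not the authors' evaluation code: the function names, the 0.5 decision threshold, and the toy labels and scores below are all assumptions made for illustration, and the AUC is computed with the generic pairwise-ranking (Mann-Whitney) formulation rather than any method described in the paper.

```python
# Illustrative sketch only: hypothetical helper names and toy data,
# not the evaluation pipeline used in the study.

def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = GON, 0 = not GON)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy reference grades and model scores (made up for this example).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.95, 0.80, 0.60, 0.30, 0.70, 0.40, 0.20, 0.10, 0.05, 0.02]

# Assumed decision threshold of 0.5; the paper's operating point is not given here.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)
area = auc(y_true, scores)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={area:.3f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas the AUC summarizes ranking quality across all thresholds, which is why the abstract reports both.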
