4.5 Article

Development of a Deep Learning Algorithm for the Histopathologic Diagnosis and Gleason Grading of Prostate Cancer Biopsies: A Pilot Study

Journal

EUROPEAN UROLOGY FOCUS
Volume 7, Issue 2, Pages 347-351

Publisher

ELSEVIER
DOI: 10.1016/j.euf.2019.11.003

Keywords

Machine learning; Deep learning; Prostate cancer; Diagnosis; Gleason grade

Funding

  1. NIGMS/Advance-CTR through IDeA-CTR grant [U54GM115677]
  2. Carney Institute for Brain Sciences
  3. Center for Vision Research
  4. Center for Computation and Visualization

A state-of-the-art deep learning algorithm was developed for the histopathologic diagnosis and Gleason grading of prostate biopsy specimens, achieving 91.5% accuracy in coarse classification and 85.4% accuracy in fine classification. The algorithm showed excellent performance with high sensitivity and specificity, though limitations include the small sample size and the need for external validation.
Background: The pathologic diagnosis and Gleason grading of prostate cancer are time-consuming, error-prone, and subject to interobserver variability. Machine learning offers opportunities to improve the diagnosis, risk stratification, and prognostication of prostate cancer.
Objective: To develop a state-of-the-art deep learning algorithm for the histopathologic diagnosis and Gleason grading of prostate biopsy specimens.
Design, setting, and participants: A total of 85 prostate core biopsy specimens from 25 patients were digitized at 20× magnification and annotated for Gleason 3, 4, and 5 prostate adenocarcinoma by a urologic pathologist. From these virtual slides, we sampled 14,803 image patches of 256 × 256 pixels, approximately balanced for malignancy.
Outcome measurements and statistical analysis: We trained and tested a deep residual convolutional neural network to classify each patch at two levels: (1) coarse (benign vs malignant) and (2) fine (benign vs Gleason 3 vs 4 vs 5). Model performance was evaluated using fivefold cross-validation. Randomization tests were used for hypothesis testing of model performance versus chance.
Results and limitations: The model demonstrated 91.5% accuracy (p < 0.001) at coarse-level classification of image patches as benign versus malignant (0.93 sensitivity, 0.90 specificity, and 0.95 average precision). The model demonstrated 85.4% accuracy (p < 0.001) at fine-level classification of image patches as benign versus Gleason 3 versus Gleason 4 versus Gleason 5 (0.83 sensitivity, 0.94 specificity, and 0.83 average precision), with the greatest number of confusions in distinguishing between Gleason 3 and 4, and between Gleason 4 and 5. Limitations include the small sample size and the need for external validation.
Conclusions: In this study, a deep learning-based computer vision algorithm demonstrated excellent performance for the histopathologic diagnosis and Gleason grading of prostate cancer.
Patient summary: We developed a deep learning algorithm that demonstrated excellent performance for the diagnosis and grading of prostate cancer. (C) 2019 European Association of Urology. Published by Elsevier B.V. All rights reserved.
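
Illustrative sketch: The abstract reports accuracy, sensitivity, and specificity for the coarse (benign vs malignant) task, and a randomization test of accuracy versus chance, but it does not give the exact procedure. The Python sketch below shows one common way such quantities could be computed. It assumes a label-shuffling (permutation) form of the randomization test. The function names `binary_metrics` and `randomization_p_value`, the number of permutations, and the toy labels are hypothetical and are not taken from the study.

```python
import random


def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity for labels 0 (benign) and 1 (malignant)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # malignant, called malignant
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign, called benign
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign, called malignant
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # malignant, called benign
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def randomization_p_value(y_true, y_pred, n_permutations=2000, seed=0):
    """Estimate how often shuffled labels match the predictions at least as well
    as the real labels do (assumed label-permutation scheme, not the paper's)."""
    rng = random.Random(seed)
    observed = sum(t == p for t, p in zip(y_true, y_pred))
    shuffled = list(y_true)
    at_least_as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any link between labels and predictions
        correct = sum(s == p for s, p in zip(shuffled, y_pred))
        if correct >= observed:
            at_least_as_good += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (at_least_as_good + 1) / (n_permutations + 1)


# Toy patch labels for illustration only (not data from the study).
toy_true = [0] * 10 + [1] * 10
toy_pred = [0] * 9 + [1] + [1] * 9 + [0]

toy_metrics = binary_metrics(toy_true, toy_pred)
toy_p = randomization_p_value(toy_true, toy_pred)
print(toy_metrics, toy_p)
```

With real data, `y_true` would hold the pathologist's patch annotations and `y_pred` the held-out predictions pooled across the cross-validation folds. The four-class fine task would need per-class (one-vs-rest) versions of the same counts.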
