Article

Benchmarking weakly-supervised deep learning pipelines for whole slide classification in computational pathology

Journal

MEDICAL IMAGE ANALYSIS
Volume 79

Publisher

ELSEVIER
DOI: 10.1016/j.media.2022.102474

Keywords

Computational pathology; Artificial intelligence; Weakly-supervised deep learning; Vision transformers; Convolutional neural networks; Multiple-Instance Learning

Funding

  1. German Federal Ministry of Health [ZMVI1-2520DAT111]
  2. Max-Eder-Programme of the German Cancer Aid [70113864]
  3. German Research Foundation (DFG) [SFB CRC1382, SFB-TRR57]
  4. German Research Council [BR 1704/6-1, BR 1704/6-3, BR 1704/6-4, CH 117/1-1, HO 5117/2-1, HO 5117/2-2, HE 5998/2-1, KL 2354/3-1, RO 2270/8-1, BR 1704/17-1]
  5. German Federal Ministry of Education and Research [01KH0404, 01ER0814, 01ER0815, 01ER1505A, 01ER1505B]
  6. DFG, German Research Foundation [322900939, 454024652, 432698239, 445703531]
  7. European Research Council (ERC) [101001791]
  8. Federal Ministry of Education and Research [STOP-FSGS-01GM1901A]
  9. Federal Ministry of Economic Affairs and Energy (EMPAIA) [01MK2002A]
  10. DFG [GA 1384/3-1, GA 1384/5-1]
  11. Federal Ministry of Economic Affairs and Energy
  12. NCT Tissue Bank at the Institute of Pathology, University of Heidelberg
  13. DFG
  14. Interdisciplinary Research Program of the National Center for Tumor Diseases (NCT), Germany


This study systematically compared classical weakly-supervised analysis methods with multiple-instance learning (MIL) methods on clinically relevant prediction tasks. All methods performed well in histological tumor subtyping of renal cell carcinoma, while classical weakly-supervised methods outperformed MIL-based methods in mutation prediction for colorectal, gastric, and bladder cancer.
Artificial intelligence (AI) can extract visual information from histopathological slides and yield biological insight and clinical biomarkers. Whole slide images are cut into thousands of tiles and classification problems are often weakly-supervised: the ground truth is only known for the slide, not for every single tile. In classical weakly-supervised analysis pipelines, all tiles inherit the slide label, while in multiple-instance learning (MIL), only bags of tiles inherit the label. However, it is still unclear how these widely used but markedly different approaches perform relative to each other. We implemented and systematically compared six methods in six clinically relevant end-to-end prediction tasks using data from N = 2980 patients for training with rigorous external validation. We tested three classical weakly-supervised approaches with convolutional neural networks and vision transformers (ViT) and three MIL-based approaches with and without an additional attention module. Our results empirically demonstrate that histological tumor subtyping of renal cell carcinoma is an easy task in which all approaches achieve an area under the receiver operating characteristic curve (AUROC) above 0.9. In contrast, we report significant performance differences for clinically relevant tasks of mutation prediction in colorectal, gastric, and bladder cancer. In these mutation prediction tasks, classical weakly-supervised workflows outperformed MIL-based weakly-supervised methods, which is surprising given their simplicity. This shows that new end-to-end image analysis pipelines in computational pathology should be compared to classical weakly-supervised methods. Also, these findings motivate the development of new methods which combine the elegant assumptions of MIL with the empirically observed higher performance of classical weakly-supervised approaches.
We make all source code publicly available at https://github.com/KatherLab/HIA, allowing easy application of all methods to any similar task. © 2022 Elsevier B.V. All rights reserved.
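
The difference between the two labeling schemes described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the HIA repository: the function names, the fixed bag size, and the mean and max pooling rules are hypothetical choices made for illustration only, and tiles are represented by placeholder strings rather than image data.

```python
from statistics import mean

# A slide is modeled as (list_of_tiles, slide_label). Tiles are placeholder
# strings here; in a real pipeline they would be image patches or features.

def classical_tile_labels(slides):
    """Classical weak supervision: every tile inherits its slide's label."""
    return [(tile, label) for tiles, label in slides for tile in tiles]

def make_bags(slides, bag_size):
    """MIL: tiles are grouped into bags, and only the bag carries the label.

    The fixed bag_size is an illustrative assumption.
    """
    bags = []
    for tiles, label in slides:
        for start in range(0, len(tiles), bag_size):
            bags.append((tiles[start:start + bag_size], label))
    return bags

def mean_pool(tile_scores):
    """One possible slide-level score: the average of per-tile scores."""
    return mean(tile_scores)

def max_pool(tile_scores):
    """Another possible aggregation: the highest per-tile score in the bag."""
    return max(tile_scores)

slides = [(["a", "b", "c"], 1), (["d", "e"], 0)]
tile_level = classical_tile_labels(slides)   # five (tile, label) pairs
bag_level = make_bags(slides, bag_size=2)    # three (bag, label) pairs
```

In the classical scheme the training set contains one labeled example per tile, whereas in the MIL scheme it contains one labeled example per bag, so the model never sees a label attached to an individual tile.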
