Article

Central Reading of Ulcerative Colitis Clinical Trial Videos Using Neural Networks

Journal

GASTROENTEROLOGY
Volume 160, Issue 3, Pages 710-+

Publisher

W B SAUNDERS CO-ELSEVIER INC
DOI: 10.1053/j.gastro.2020.10.024

Keywords

Machine Learning; Computer Vision; Endoscopic Scores; Efficacy End Points

Funding

  1. Eli Lilly and Company, Indianapolis, Indiana


A deep learning algorithm was successfully trained to predict levels of UC severity from full-length endoscopy videos, with excellent agreement metrics compared to human central readers. Prospective data collection from a multinational clinical trial, use of videos instead of still images, and reporting of UCEIS and eMS all contributed to the success of the machine learning algorithm in predicting UC severity.
BACKGROUND AND AIMS: Endoscopic disease activity scoring in ulcerative colitis (UC) is useful in clinical practice but done infrequently. It is required in clinical trials, where it is expensive and slow because human central readers are needed. A machine learning algorithm automating the process could elevate clinical care and facilitate clinical research. Prior work using single-institution databases and endoscopic still images has been promising.

METHODS: Seven hundred and ninety-five full-length endoscopy videos were prospectively collected from a phase 2 trial of mirikizumab with 249 patients from 14 countries, totaling 19.5 million image frames. Expert central readers assigned each full-length endoscopy video 1 endoscopic Mayo score (eMS) and 1 Ulcerative Colitis Endoscopic Index of Severity (UCEIS) score. Initially, video data were cleaned and abnormality features were extracted using convolutional neural networks. Subsequently, a recurrent neural network was trained on the features to predict eMS and UCEIS from individual full-length endoscopy videos.

RESULTS: The primary metric to assess the performance of the recurrent neural network model was quadratic weighted kappa (QWK), which compares the agreement of the machine-read endoscopy score with the human central reader score. QWK progressively penalizes disagreements that exceed 1 level. The model's agreement metric was excellent, with a QWK of 0.844 (95% confidence interval, 0.787-0.901) for eMS and 0.855 (95% confidence interval, 0.80-0.91) for UCEIS.

CONCLUSIONS: We found that a deep learning algorithm can be trained to predict levels of UC severity from full-length endoscopy videos. Our data set was prospectively collected in a multinational clinical trial, videos rather than still images were used, UCEIS and eMS were reported, and machine learning algorithm performance metrics met or exceeded those previously published for UC severity scores.
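
To make the agreement metric concrete, the following is a minimal sketch of how a quadratic weighted kappa can be computed from paired human and machine scores. It uses the standard textbook definition of QWK and is not taken from the paper. The function name `quadratic_weighted_kappa`, its parameters, and the example scores are hypothetical and chosen only for illustration. The example assumes an ordinal scale with integer levels starting at 0, such as the eMS with its 4 levels (0 to 3).

```python
def quadratic_weighted_kappa(human_scores, machine_scores, n_levels):
    """Quadratic weighted kappa for two lists of ordinal scores in 0..n_levels-1.

    Illustrative sketch of the standard definition, not the paper's code.
    """
    if len(human_scores) != len(machine_scores) or not human_scores:
        raise ValueError("score lists must be non-empty and of equal length")
    n = len(human_scores)

    # Observed confusion matrix: rows are human scores, columns are machine scores.
    observed = [[0.0] * n_levels for _ in range(n_levels)]
    for h, m in zip(human_scores, machine_scores):
        observed[h][m] += 1.0

    row_totals = [sum(row) for row in observed]
    col_totals = [sum(observed[i][j] for i in range(n_levels))
                  for j in range(n_levels)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_levels):
        for j in range(n_levels):
            # Quadratic weight: 0 on the diagonal, 1 for the largest possible gap.
            weight = ((i - j) ** 2) / ((n_levels - 1) ** 2)
            weighted_observed += weight * observed[i][j]
            # Expected count if the two readers scored independently.
            weighted_expected += weight * row_totals[i] * col_totals[j] / n

    # Undefined (division by zero) when both readers give one identical
    # score to every video; this sketch does not handle that case.
    return 1.0 - weighted_observed / weighted_expected


if __name__ == "__main__":
    human = [0, 1, 2, 3, 0, 1, 2, 3]          # hypothetical central-reader scores
    off_by_one = [1, 1, 2, 3, 0, 1, 2, 3]     # one video misread by 1 level
    off_by_three = [3, 1, 2, 3, 0, 1, 2, 3]   # same video misread by 3 levels
    print(quadratic_weighted_kappa(human, off_by_one, 4))
    print(quadratic_weighted_kappa(human, off_by_three, 4))
```

In this sketch a single 3-level miss lowers the kappa far more than a single 1-level miss, because the penalty grows with the square of the distance between the two scores.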
