Article

Validation of a natural language processing algorithm to identify adenomas and measure adenoma detection rates across a health system: a population-level study

Journal

GASTROINTESTINAL ENDOSCOPY
Volume 97, Issue 1, Pages 121-+

Publisher

MOSBY-ELSEVIER
DOI: 10.1016/j.gie.2022.07.009

This study developed and validated a natural language processing (NLP) algorithm to identify colorectal adenomas and report adenoma detection rates (ADRs) at the population level in Ontario, Canada. The algorithm showed high sensitivity and specificity in identifying adenomas and was then applied to measure ADR at the system level.
Background and Aims: Measuring adenoma detection rates (ADRs) at the population level is challenging because pathology reports are often written in an unstructured format; further, there is significant variation in reporting methods across institutions. Natural language processing (NLP) can be used to extract relevant information from text-based records. We aimed to develop and validate an NLP algorithm to identify colorectal adenomas that could be used to report ADR at the population level in Ontario, Canada.
Methods: The sampling frame included pathology reports from all colonoscopies performed in Ontario in 2015 and 2016. Two random samples of 450 and 1000 reports were selected as the training and validation sets, respectively. Expert clinicians reviewed and classified reports as adenoma or other. The training set was used to develop an NLP algorithm (to identify adenomas) that was evaluated using the validation set. The NLP algorithm test characteristics were calculated using expert review as the reference. We used the algorithm to measure ADR for all endoscopists in Ontario in 2019.
Results: The 1450 pathology reports were derived from 62 laboratories, 266 pathologists, and 532 endoscopists. In the training set, the NLP algorithm for any adenoma had a sensitivity of 99.60% (95% confidence interval [CI], 97.77-99.99), specificity of 99.01% (95% CI, 96.49-99.88), positive predictive value of 99.19% (95% CI, 97.12-99.90), and F1 score of .99. Similar results were obtained for the validation set. The median ADR was 33% (interquartile range, 26%-40%).
Conclusions: When we used a population-based sample from Ontario, our NLP algorithm was highly accurate and was used at the system level to measure ADR. (Gastrointest Endosc 2023;97:121-9.)
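The test characteristics named in the abstract (sensitivity, specificity, positive predictive value, and F1 score) can all be derived from a comparison of algorithm output against expert review. The following is a minimal sketch of that arithmetic, not the authors' implementation: the names `ConfusionCounts`, `count_outcomes`, and `diagnostic_metrics` are hypothetical, and the example counts are invented for illustration and are not taken from the study.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # algorithm says adenoma, expert says adenoma
    fp: int  # algorithm says adenoma, expert says other
    tn: int  # algorithm says other, expert says other
    fn: int  # algorithm says other, expert says adenoma


def count_outcomes(pairs: Iterable[Tuple[bool, bool]]) -> ConfusionCounts:
    """Tally (algorithm_label, expert_label) pairs; True means 'adenoma'."""
    tp = fp = tn = fn = 0
    for predicted, reference in pairs:
        if predicted and reference:
            tp += 1
        elif predicted and not reference:
            fp += 1
        elif not predicted and reference:
            fn += 1
        else:
            tn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def diagnostic_metrics(c: ConfusionCounts) -> Dict[str, float]:
    """Compute test characteristics with expert review as the reference.

    Assumes every denominator is non-zero (at least one adenoma report,
    one non-adenoma report, and one positive call by the algorithm).
    """
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    ppv = c.tp / (c.tp + c.fp)
    # F1 is the harmonic mean of PPV (precision) and sensitivity (recall).
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "f1": f1,
    }


# Hypothetical labels for 200 reports (illustrative only, not study data).
hypothetical_pairs = (
    [(True, True)] * 90
    + [(True, False)] * 5
    + [(False, False)] * 95
    + [(False, True)] * 10
)
counts = count_outcomes(hypothetical_pairs)
metrics = diagnostic_metrics(counts)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Confidence intervals such as those reported in the abstract would need an additional interval method for binomial proportions, which this sketch leaves out.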
