4.8 Article

Predicting Incident Adenocarcinoma of the Esophagus or Gastric Cardia Using Machine Learning of Electronic Health Records

Journal

GASTROENTEROLOGY
Volume 165, Issue 6, Pages 1420-+

Publisher

W B SAUNDERS CO-ELSEVIER INC
DOI: 10.1053/j.gastro.2023.08.011

Keywords

Electronic Health Records; Esophageal Neoplasms; Stomach Neoplasms; Gastroesophageal Reflux Disease; Mass Screening

Ask authors/readers for more resources

The K-ECAN tool was developed to predict the occurrence of esophageal adenocarcinoma and gastric cardia adenocarcinoma using electronic health records data, and it showed better discrimination than other validated models. However, further validation and implementation in electronic health records are needed.
BACKGROUND & AIMS: Tools that can automatically predict incident esophageal adenocarcinoma (EAC) and gastric cardia adenocarcinoma (GCA) using electronic health records to guide screening decisions are needed.METHODS: The Veterans Health Administration (VHA) Corporate Data Warehouse was accessed to identify Veterans with 1 or more encounters between 2005 and 2018. Patients diagnosed with EAC (n = 8430) or GCA (n = 2965) were identified in the VHA Central Cancer Registry and compared with 10,256,887 controls. Predictors included demographic characteristics, prescriptions, laboratory results, and diagnoses between 1 and 5 years before the index date. The Kettles Esophageal and Cardia Adenocarcinoma predictioN (K-ECAN) tool was developed and internally validated using simple random sampling imputation and extreme gradient boosting, a machine learning method. Training was performed in 50% of the data, preliminary validation in 25% of the data, and final testing in 25% of the data.RESULTS: KECAN was well-calibrated and had better discrimination (area under the receiver operating characteristic curve [AuROC], 0.77) than previously validated models, such as the NordTrondelag Health Study (AuROC, 0.68) and Kunzmann model (AuROC, 0.64), or published guidelines. Using only data from between 3 and 5 years before index diminished its accuracy slightly (AuROC, 0.75). Undersampling men to simulate a nonVHA population, AUCs of the Nord-Trondelag Health Study and Kunzmann model improved, but K-ECAN was still the most accurate (AuROC, 0.85). Although gastroesophageal reflux disease was strongly associated with EAC, it contributed only a small proportion of gain in information for prediction.CONCLUSIONS: K-ECAN is a novel, internally validated tool predicting incident EAC and GCA using electronic health records data. Further work is needed to validate K-ECAN outside VHA and to assess how best to implement it within electronic health records.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.8
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available