Article

Reliability and responsiveness of endoscopic disease activity assessment in eosinophilic esophagitis

Journal

GASTROINTESTINAL ENDOSCOPY
Volume 95, Issue 6, Pages 1126-+

Publisher

MOSBY-ELSEVIER
DOI: 10.1016/j.gie.2022.01.014

This study evaluated the operating properties of endoscopic measures in eosinophilic esophagitis (EoE) clinical trials and concluded that EREFS and its modifications are reliable and responsive, with scoring based on the worst affected area optimizing reliability and responsiveness.
Background and Aims: Endoscopic outcomes have become important measures of eosinophilic esophagitis (EoE) disease activity, including as an endpoint in randomized controlled trials (RCTs). We evaluated the operating properties of endoscopic measures for use in EoE RCTs.
Methods: Modified Research and Development/University of California Los Angeles appropriateness methods and a panel of 15 international EoE experts identified endoscopic items and definitions with face validity that were used in a 2-round voting process to define simplified (all items graded as absent or present) and expanded versions (additional grades for edema, furrows, and/or exudates) of the EoE Endoscopic Reference Score (EREFS). Interrater and intrarater reliability of these instruments (expressed as intraclass correlation coefficients [ICCs]) was evaluated using paired endoscopy video assessments of 2 blinded central readers in patients before and after 8 weeks of proton pump inhibitors, swallowed topical corticosteroids, or dietary elimination. Responsiveness was measured using the standardized effect size (SES).
Results: The appropriateness of 41 statements relevant to EoE endoscopic activity (endoscopic items, item definitions and grading, and other considerations relevant for endoscopy) was considered. The original and expanded EREFS demonstrated moderate-to-substantial interrater reliability (ICCs of .472-.736 and .469-.763, respectively) and moderate-to-almost perfect intrarater reliability (ICCs of .580-.828 and .581-.828, respectively). Strictures were least reliably assessed (ICC, .072-.385). The original EREFS was highly responsive (SES, 1.126 [95% confidence interval {CI}, .757-1.534]), although both expanded versions of EREFS, scored based on worst affected area, were numerically most responsive to treatment (expanded furrows: SES, 1.229 [95% CI, .858-1.643]; all items expanded: SES, 1.252 [95% CI, .880-1.667]). The EREFS and its modifications were not more reliably scored by segment, nor were they more responsive when proximal and distal EREFSs were summed.
Conclusions: EREFS and its modifications were reliable and responsive, and the original or expanded versions of the EREFS may be preferred in RCTs. Disease activity scored based on the worst affected area optimizes reliability and responsiveness.
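To illustrate the responsiveness measure named in the abstract, the sketch below computes a standardized effect size from paired scores. It assumes one common definition (mean change from baseline divided by the standard deviation of the baseline scores); the abstract does not state the exact formula the authors used. The function name, the sign convention (a score decrease counts as positive), and the scores are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def standardized_effect_size(baseline, followup):
    """Mean change (baseline minus follow-up) divided by the SD of baseline.

    Assumed definition for illustration only; the source abstract does not
    specify how its SES was calculated.
    """
    if len(baseline) != len(followup):
        raise ValueError("paired scores must have equal length")
    if len(baseline) < 2:
        raise ValueError("need at least two paired observations")
    sd_baseline = stdev(baseline)  # sample SD (n - 1 denominator)
    if sd_baseline == 0:
        raise ValueError("baseline scores have zero variance")
    # A positive change means the score fell after treatment.
    changes = [b - f for b, f in zip(baseline, followup)]
    return mean(changes) / sd_baseline


# Hypothetical endoscopic scores for five patients, before and after treatment.
baseline = [4, 5, 3, 6, 2]
followup = [2, 3, 2, 3, 1]
ses = standardized_effect_size(baseline, followup)
print(f"SES = {ses:.3f}")
```

Under this definition, a larger SES means a larger treatment-related change relative to the baseline spread, which is the sense in which the abstract compares the EREFS versions.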
