4.6 Article

Interobserver variance and patient heterogeneity influencing the treatment of grade I spondylolisthesis

Journal

SPINE JOURNAL
Volume 20, Issue 12, Pages 1934-1939

Publisher

ELSEVIER SCIENCE INC
DOI: 10.1016/j.spinee.2020.06.001

Keywords

Fusion; Heterogeneity; Lumbar spine; Spondylolisthesis; Stenosis; Variance

BACKGROUND CONTEXT: Despite well-done randomized clinical trials, the role of fusion as an adjunct to decompression for the treatment of patients with degenerative spondylolisthesis remains controversial. There is substantial variation in the use of fusion, as well as in the techniques used for fusion, for a population of patients all described by a single ICD-10 code.

PURPOSE: We sought to investigate the source of the variation in the perceived role of fusion by looking at surgeon-specific as well as patient-specific factors.

STUDY DESIGN: Prospective cohort study examining the variability of recommendations from an expert panel of surgeons based on imaging and clinical vignettes.

PATIENT SAMPLE: Patients with degenerative spondylolisthesis and stenosis.

OUTCOME MEASURES: A six-category treatment schema based on the level of invasiveness of the proposed surgery, with categories one through three representing nonfusion strategies and categories four through six representing fusion strategies.

METHODS: The authors are conducting the ongoing Spinal Laminectomy versus Instrumented Pedicle Screw (SLIP) II study, in which patients with grade I degenerative spondylolisthesis and stenosis are randomized to two groups: a review group, in which patients are treated as per the recommendations of an expert panel, and a nonreview group, in which patients are treated as per the referring surgeon's preference. In the review group, clinical vignettes and radiographic studies were evaluated by an expert panel of spine surgeons. The panel then provided its recommendations to the referring surgeon. We investigated the underlying variability by looking both at the number of similar or different recommendations received by an individual patient (surgeon-related variability) and at the number of similar or different recommendations offered by individual surgeons across the population of patients (patient heterogeneity). Agreement between surgeons for fusion vs nonfusion (Categories 1-3 vs 4-6) was calculated using a Kappa value from a mixed effects logistic regression model. We looked at Kappa for agreement and weighted Kappa for association of ratings on the ordinal 1 to 6 scale with a mixed effects linear regression model. Additionally, we summarized the data between patients after averaging the rater scores within each patient. Similarly, we summarized the data between surgeons after averaging each surgeon's scores over the patients that surgeon reviewed.

RESULTS: One hundred and fourteen patients received 1,463 treatment recommendations. On average, fusion was recommended 58.5% of the time. Overall agreement was low, and perfect agreement on the need for fusion was seen in only 24 patients (21.1%). The Kappa statistic for agreement on fusion was 0.378 (95% confidence interval 0.324-0.432). The average score across surgeons was 4.2 (0.6) with a range of 3 to 5.3. The most common single recommendation was for fusion with interbody fusion (40.8%) and the least common was for decompression with noninstrumented fusion (0.5%).

CONCLUSIONS: We demonstrated variability in surgical approach when individual patients were evaluated by a panel of surgeons, indicating that even expert surgeons disagree with each other regarding the need for fusion in individual patients. We were also able to demonstrate that individual patients received consistent recommendations that were very different from those received by other individuals evaluated by the same surgeons. This indicates that there is patient-related heterogeneity driving variability independent of surgeon factors. (C) 2020 Elsevier Inc. All rights reserved.
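
To make the agreement analysis in METHODS more concrete, the following is a minimal sketch of how chance-corrected agreement on fusion vs nonfusion could be computed from panel ratings. It is not the authors' method: the abstract reports a Kappa derived from a mixed effects logistic regression model, whereas this sketch substitutes the simpler Fleiss' kappa and assumes every patient received the same number of ratings. The function names and the toy scores are hypothetical and are not study data.

```python
from collections import Counter


def dichotomize(score):
    """Map a 1-6 treatment category to 0 (nonfusion, 1-3) or 1 (fusion, 4-6)."""
    if score not in range(1, 7):
        raise ValueError(f"category must be an integer 1-6, got {score!r}")
    return 1 if score >= 4 else 0


def perfect_agreement_share(votes):
    """Fraction of patients whose ratings all fall in the same category."""
    return sum(len(set(row)) == 1 for row in votes) / len(votes)


def fleiss_kappa(votes):
    """Fleiss' kappa for a list of per-patient rating lists.

    Assumption: every patient has the same number of ratings (at least 2).
    """
    n = len(votes[0])
    if n < 2 or any(len(row) != n for row in votes):
        raise ValueError("each patient needs the same number (>= 2) of ratings")

    totals = Counter()  # how often each category was chosen overall
    observed = 0.0      # running sum of per-patient agreement
    for row in votes:
        counts = Counter(row)
        totals.update(counts)
        # Share of rater pairs that agree for this patient.
        observed += (sum(c * c for c in counts.values()) - n) / (n * (n - 1))
    observed /= len(votes)

    # Agreement expected by chance from the overall category proportions.
    expected = sum((c / (n * len(votes))) ** 2 for c in totals.values())
    if expected == 1.0:
        raise ValueError("kappa is undefined when every rating is identical")
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy data: 4 patients, each scored 1-6 by 4 surgeons.
toy_scores = [[5, 5, 4, 6], [2, 1, 3, 2], [2, 5, 4, 3], [5, 4, 2, 5]]
toy_votes = [[dichotomize(s) for s in row] for row in toy_scores]
print("perfect agreement share:", perfect_agreement_share(toy_votes))
print("Fleiss' kappa (fusion vs nonfusion):", round(fleiss_kappa(toy_votes), 3))
```

In the study itself the number of recommendations per patient was not constant (1,463 recommendations across 114 patients), which is one reason a model-based Kappa fits better than this fixed-panel formula.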
