Editorial Material

Most Experts Agree… But What About Other EEG Readers?

Journal

EPILEPSY CURRENTS
Volume 20, Issue 2, Pages 78-79

Publisher

SAGE PUBLICATIONS INC
DOI: 10.1177/1535759720901511

Inter-Rater Reliability of Experts in Identifying Interictal Epileptiform Discharges in Electroencephalograms. Jing J, Herlopian A, Karakis I, et al. JAMA Neurol. Published online October 21, 2019. doi:

Importance: The validity of using electroencephalograms (EEGs) to diagnose epilepsy requires reliable detection of interictal epileptiform discharges (IEDs). Prior interrater reliability (IRR) studies are limited by small samples and selection bias.

Objective: To assess the reliability of experts in detecting IEDs in routine EEGs.

Design, Setting, and Participants: This prospective analysis, conducted in 2 phases, included physicians with at least 1 year of subspecialty training in clinical neurophysiology as participants. In phase 1, 9 experts independently identified candidate IEDs in 991 EEGs (1 expert per EEG) reported in the medical record to contain at least 1 IED, yielding 87 636 candidate IEDs. In phase 2, the candidate IEDs were clustered into groups with distinct morphological features, yielding 12 602 clusters, and a representative candidate IED was selected from each cluster. We added 660 waveforms (11 random samples each from 60 randomly selected EEGs reported as being free of IEDs) as negative controls. Eight experts independently scored all 13 262 candidates as IEDs or non-IEDs. The 1051 EEGs in the study were recorded at the Massachusetts General Hospital between 2012 and 2016.

Main Outcomes and Measures: Primary outcome measures were percentage of agreement (PA) and beyond-chance agreement (Gwet kappa) for individual IEDs (IED-wise IRR) and for whether an EEG contained any IEDs (EEG-wise IRR). Secondary outcomes were the correlations between numbers of IEDs marked by experts across cases, calibration of expert scoring to group consensus, and receiver operating characteristic analysis of how well multivariate logistic regression models may account for differences in the IED scoring behavior between experts.

Results: Among the 1051 EEGs assessed in the study, 540 (51.4%) were those of females and 511 (48.6%) were those of males. In phase 1, 9 experts each marked potential IEDs in a median of 65 (interquartile range: 28-332) EEGs. The total number of IED candidates marked was 87 636. Expert IRR for the 13 262 individually annotated IED candidates was fair, with the mean PA being 72.4% (95% confidence interval [CI]: 67.0%-77.8%) and mean kappa being 48.7% (95% CI: 37.3%-60.1%). The EEG-wise IRR was substantial, with the mean PA being 80.9% (95% CI: 76.2%-85.7%) and mean kappa being 69.4% (95% CI: 60.3%-78.5%). A statistical model based on waveform morphological features, when provided with individualized thresholds, explained the median binary scores of all experts with a high degree of accuracy of 80% (range: 73%-88%).

Conclusions and Relevance: This study's findings suggest that experts can identify whether EEGs contain IEDs with substantial reliability. Lower reliability regarding individual IEDs may be largely explained by various experts applying different thresholds to a common underlying statistical model.
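
The concluding idea, that experts may share one underlying model but apply different thresholds to it, can be made concrete with a small Python sketch. This is a deliberately simplified illustration, not the study's model: the study used multivariate logistic regression on waveform morphological features, whereas here each waveform simply carries one hypothetical shared score, and the expert names and threshold values are invented.

```python
def score_with_threshold(shared_scores, threshold):
    """Label a waveform as an IED (1) when its shared score reaches the threshold."""
    return [1 if s >= threshold else 0 for s in shared_scores]


# Hypothetical shared "IED-likeness" scores for 7 waveforms (not study data).
shared_scores = [0.05, 0.20, 0.35, 0.50, 0.65, 0.80, 0.95]

# Hypothetical experts who differ only in how strict their threshold is.
thresholds = {"lenient": 0.30, "moderate": 0.50, "strict": 0.70}

labels = {
    name: score_with_threshold(shared_scores, t)
    for name, t in thresholds.items()
}

# Waveforms on which the experts do not all give the same label.
disputed = [
    i for i in range(len(shared_scores))
    if len({labels[name][i] for name in labels}) > 1
]
```

In this sketch all three experts rank the waveforms identically, yet they disagree on every waveform whose score falls between the most lenient and the strictest threshold. Clearly normal and clearly epileptiform waveforms are scored unanimously.
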
