4.4 Article

The Effect of the Raters' Marginal Distributions on Their Matched Agreement: A Rescaling Framework for Interpreting Kappa

Journal

MULTIVARIATE BEHAVIORAL RESEARCH
Volume 48, Issue 6, Pages 923-952

Publisher

ROUTLEDGE JOURNALS, TAYLOR & FRANCIS LTD
DOI: 10.1080/00273171.2013.830064

Keywords

-

Ask authors/readers for more resources

Cohen's measures the improvement in classification above chance level and it is the most popular measure of interjudge agreement. Yet, there is considerable confusion about its interpretation. Specifically, researchers often ignore the fact that the observed level of matched agreement is bounded from above and below and the bounds are a function of the particular marginal distributions of the table. We propose that these bounds should be used to rescale the components of (observed and expected agreement). Rescaling in this manner results in , a measure that was originally proposed by Cohen (1960) and was largely ignored in both research and practice. This measure provides a common scale for agreement measures of tables with different marginal distributions. It reaches the maximal value of 1 when the judges show the highest level of agreement possible, given their marginal disagreements. We conclude that should be used to measure the level of matched agreement contingent on a particular set of marginal distributions. The article provides a framework and a set of guidelines that facilitate comparisons between various types of agreement tables. We illustrate our points with simulations and real data from two studiesone involving judges' ratings of baseball players and one involving ratings of essays in high-stakes tests.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.4
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available