4.5 Article

Better to be in agreement than in bad company: A critical analysis of many kappa-like tests

Related references

Note: only a subset of the related references is listed here.
Article · Multidisciplinary Sciences

An Empirical Comparative Assessment of Inter-Rater Agreement of Binary Outcomes and Multiple Raters

Menelaos Konstantinidis et al.

Summary: This study evaluated the performance of four commonly used inter-rater agreement statistics (three Kappa-type statistics and Gwet's AC1) in the multiple-rater setting. The expected values of all four statistics were equal when the outcome prevalence was symmetric, but only the expected values of the three Kappa statistics remained equal when the prevalence was asymmetric. Fleiss' Kappa showed higher variance in the symmetric case, while Gwet's AC1 showed lower variance in the asymmetric case. The authors recommend favoring Gwet's AC1 when the population-level prevalence of outcomes is unknown, and transforming between statistics when inter-rater agreement measures must be compared directly. A brief computational sketch follows this entry.

SYMMETRY-BASEL (2022)
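To make the prevalence sensitivity described above concrete, here is a minimal sketch of Fleiss' Kappa and Gwet's AC1 for the multiple-rater case. It is not code from the study: the counts layout (one row per subject, one column per category) and the toy data are illustrative assumptions.

```python
import numpy as np

def agreement_stats(counts):
    """Fleiss' Kappa and Gwet's AC1 from an N x k matrix of category counts.

    counts[i, j] = number of raters assigning subject i to category j;
    assumes every subject is rated by the same number of raters.
    """
    counts = np.asarray(counts, dtype=float)
    N, k = counts.shape
    n = counts[0].sum()  # raters per subject

    # Observed agreement: mean proportion of agreeing rater pairs per subject
    pa = (((counts ** 2).sum(axis=1) - n) / (n * (n - 1))).mean()

    # Mean prevalence of each category across all ratings
    pi = counts.sum(axis=0) / (N * n)

    # Chance-agreement terms: Fleiss' grows with prevalence imbalance,
    # Gwet's shrinks with it
    pe_kappa = (pi ** 2).sum()
    pe_ac1 = (pi * (1 - pi)).sum() / (k - 1)

    kappa = (pa - pe_kappa) / (1 - pe_kappa)
    ac1 = (pa - pe_ac1) / (1 - pe_ac1)
    return kappa, ac1

# Toy data: 10 subjects, 4 raters, binary outcome, skewed prevalence (~85/15)
counts = [[4, 0]] * 7 + [[3, 1]] * 2 + [[0, 4]]
print(agreement_stats(counts))  # Kappa ~ 0.61, AC1 ~ 0.87
```

On this deliberately skewed toy data the two statistics diverge (Kappa ≈ 0.61 vs. AC1 ≈ 0.87): both share the observed agreement of 0.90, but Fleiss' chance-agreement term grows with prevalence imbalance while Gwet's shrinks, which is the asymmetric-prevalence behavior the summary describes.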

Article · Statistics & Probability

A new revised version of McNemar's test for paired binary data

Yunqing Lu et al.

COMMUNICATIONS IN STATISTICS-THEORY AND METHODS (2017)

Article · Social Sciences, Mathematical Methods

Mistakes and How to Avoid Mistakes in Using Intercoder Reliability Indices

Guangchao Charles Feng

METHODOLOGY-EUROPEAN JOURNAL OF RESEARCH METHODS FOR THE BEHAVIORAL AND SOCIAL SCIENCES (2015)

Article · Health Care Sciences & Services

Observer agreement paradoxes in 2x2 tables: comparison of agreement measures

Viswanathan Shankar et al.

BMC MEDICAL RESEARCH METHODOLOGY (2014)

Article · Medicine, General & Internal

Use of relative and absolute effect measures in reporting health inequalities: structured review

Nicholas B. King et al.

BMJ-BRITISH MEDICAL JOURNAL (2012)

Article · Statistics & Probability

A Revised Version of McNemar's Test for Paired Binary Data

Yunqing Lu

COMMUNICATIONS IN STATISTICS-THEORY AND METHODS (2010)

Article · Mathematics, Interdisciplinary Applications

Computing inter-rater reliability and its variance in the presence of high agreement

Kilem Li Gwet

BRITISH JOURNAL OF MATHEMATICAL & STATISTICAL PSYCHOLOGY (2008)

Article · Computer Science, Information Systems

Agreement, the F-measure, and reliability in information retrieval

George Hripcsak et al.

JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION (2005)