4.5 Article

Better to be in agreement than in bad company: A critical analysis of many kappa-like tests

Related references

Note: only a subset of the related references is listed here.
Article · Multidisciplinary Sciences

An Empirical Comparative Assessment of Inter-Rater Agreement of Binary Outcomes and Multiple Raters

Menelaos Konstantinidis et al.

Summary: This study evaluated the performance of four commonly used inter-rater agreement statistics (three Kappa-type statistics and Gwet's AC1) in the multiple-rater setting. The expected values of all four statistics were equal when the outcome prevalence was symmetric, but only the expected values of the three Kappa statistics remained equal when the prevalence was asymmetric. Fleiss' Kappa showed higher variance in the symmetric case, while Gwet's AC1 showed lower variance in the asymmetric case. The authors recommend favoring Gwet's AC1 when the population-level prevalence of outcomes is unknown, and transforming between statistics when inter-rater agreement measures must be compared directly. A brief computational sketch follows this entry.

SYMMETRY-BASEL (2022)
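To make the prevalence sensitivity described above concrete, here is a minimal sketch of Fleiss' Kappa and Gwet's AC1 for the multiple-rater case. It is not code from the study: the counts layout (one row per subject, one column per category) and the toy data are illustrative assumptions.

```python
import numpy as np

def agreement_stats(counts):
    """Fleiss' Kappa and Gwet's AC1 from an N x k matrix of category counts.

    counts[i, j] = number of raters assigning subject i to category j;
    assumes every subject is rated by the same number of raters.
    """
    counts = np.asarray(counts, dtype=float)
    N, k = counts.shape
    n = counts[0].sum()  # raters per subject

    # Observed agreement: mean proportion of agreeing rater pairs per subject
    pa = (((counts ** 2).sum(axis=1) - n) / (n * (n - 1))).mean()

    # Mean prevalence of each category across all ratings
    pi = counts.sum(axis=0) / (N * n)

    # Chance-agreement terms: Fleiss' grows with prevalence imbalance,
    # Gwet's shrinks with it
    pe_kappa = (pi ** 2).sum()
    pe_ac1 = (pi * (1 - pi)).sum() / (k - 1)

    kappa = (pa - pe_kappa) / (1 - pe_kappa)
    ac1 = (pa - pe_ac1) / (1 - pe_ac1)
    return kappa, ac1

# Toy data: 10 subjects, 4 raters, binary outcome, skewed prevalence (~85/15)
counts = [[4, 0]] * 7 + [[3, 1]] * 2 + [[0, 4]]
print(agreement_stats(counts))  # Kappa ~ 0.61, AC1 ~ 0.87
```

On this deliberately skewed toy data the two statistics diverge (Kappa ≈ 0.61 vs. AC1 ≈ 0.87): both share the observed agreement of 0.90, but Fleiss' chance-agreement term grows with prevalence imbalance while Gwet's shrinks, which is the asymmetric-prevalence behavior the summary describes.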

Article · Statistics & Probability

A new revised version of McNemar's test for paired binary data

Yunqing Lu et al.

COMMUNICATIONS IN STATISTICS-THEORY AND METHODS (2017)

Article · Social Sciences, Mathematical Methods

Mistakes and How to Avoid Mistakes in Using Intercoder Reliability Indices

Guangchao Charles Feng

METHODOLOGY-EUROPEAN JOURNAL OF RESEARCH METHODS FOR THE BEHAVIORAL AND SOCIAL SCIENCES (2015)

Article · Health Care Sciences & Services

Observer agreement paradoxes in 2x2 tables: comparison of agreement measures

Viswanathan Shankar et al.

BMC MEDICAL RESEARCH METHODOLOGY (2014)

Article · Medicine, General & Internal

Use of relative and absolute effect measures in reporting health inequalities: structured review

Nicholas B. King et al.

BMJ-BRITISH MEDICAL JOURNAL (2012)

Article · Statistics & Probability

A Revised Version of McNemar's Test for Paired Binary Data

Yunqing Lu

COMMUNICATIONS IN STATISTICS-THEORY AND METHODS (2010)

Article · Mathematics, Interdisciplinary Applications

Computing inter-rater reliability and its variance in the presence of high agreement

Kilem Li Gwet

BRITISH JOURNAL OF MATHEMATICAL & STATISTICAL PSYCHOLOGY (2008)

Article · Computer Science, Information Systems

Agreement, the F-measure, and reliability in information retrieval

George Hripcsak et al.

JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION (2005)