Article

Agreement, the F-measure, and reliability in information retrieval

Journal

Journal of the American Medical Informatics Association (JAMIA)

Publisher

Hanley & Belfus, Inc.
DOI: 10.1197/jamia.M1733

Keywords

-

Funding

  1. NLM NIH HHS [R01 LM06919, N01 LM07079] Funding Source: Medline

Abstract

Information retrieval studies that involve searching the Internet or marking phrases usually lack a well-defined number of negative cases. This prevents the use of traditional interrater reliability metrics such as the κ statistic to assess the quality of expert-generated gold standards. Such studies often quantify system performance as precision, recall, and F-measure, or as agreement. It can be shown that the average F-measure among pairs of experts is numerically identical to the average positive specific agreement among experts, and that κ approaches these measures as the number of negative cases grows large. Positive specific agreement, or the equivalent F-measure, may be an appropriate way to quantify interrater reliability and therefore to assess the reliability of a gold standard in these studies.
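
As an illustration of the identity stated above, the following is a minimal Python sketch, not taken from the paper: the counts a (items both raters mark positive), b and c (items only one rater marks positive), d (items neither marks positive), the function names, and the example values are all assumptions made here for demonstration. For a single pair of raters, the F-measure computed by treating one rater as the gold standard simplifies to 2a/(2a + b + c), which is exactly positive specific agreement, and Cohen's kappa tends toward the same quantity as d grows large.

    def f_measure(a, b, c):
        # Precision a/(a+b) and recall a/(a+c), treating one rater as the
        # gold standard; their harmonic mean simplifies to 2a/(2a + b + c).
        precision = a / (a + b)
        recall = a / (a + c)
        return 2 * precision * recall / (precision + recall)

    def positive_specific_agreement(a, b, c):
        # Proportion of specific (positive) agreement between the two raters.
        return 2 * a / (2 * a + b + c)

    def kappa(a, b, c, d):
        # Cohen's kappa over the full 2x2 table, which requires the count of
        # negative cases d.
        n = a + b + c + d
        p_observed = (a + d) / n
        p_expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
        return (p_observed - p_expected) / (1 - p_expected)

    a, b, c = 40, 10, 15                         # illustrative counts
    print(f_measure(a, b, c))                    # 0.7619...
    print(positive_specific_agreement(a, b, c))  # 0.7619..., identical to F
    for d in (10, 100, 10_000, 1_000_000):
        print(d, kappa(a, b, c, d))              # climbs toward 0.7619... as d grows

With these counts, the F-measure and positive specific agreement are both 80/105, about 0.762, while kappa rises from about 0.21 at d = 10 toward 0.762 as d becomes large, which is the limiting behavior described in the abstract.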

Authors

George Hripcsak, Adam S. Rothschild
