Article

Quasi-continuous and discrete confidence rating scales for observer performance studies: Effects on ROC analysis

Journal

ACADEMIC RADIOLOGY
Volume 14, Issue 1, Pages 38-48

Publisher

ELSEVIER SCIENCE INC
DOI: 10.1016/j.acra.2006.09.048

Keywords

computer-aided diagnosis; continuous and discrete confidence rating scales; ROC observer study; classification; mammography

Funding

  1. NCI NIH HHS [R01 CA095153, CA95153, R01 CA095153-05] Funding Source: Medline
  2. NATIONAL CANCER INSTITUTE [R01CA095153] Funding Source: NIH RePORTER

Rationale and Objectives. To examine, by a simulation study, how the number of categories in the rating scale used in an observer experiment affects the results of ROC analysis.

Materials and Methods. We previously evaluated the effects of computer-aided diagnosis on radiologists' characterization of malignant and benign breast masses in serial mammograms. The likelihood of malignancy was rated on a quasi-continuous (0-100 points) confidence rating scale. In this study, we simulated the use of discrete confidence rating scales with fewer categories and analyzed the results with receiver operating characteristic (ROC) methodology. The observers' estimates of the likelihood of malignancy were also mapped to BI-RADS assessments with five and seven categories, and ROC analysis was performed. The area under the ROC curve and the partial area index obtained from ROC analysis of the different confidence rating scales were compared.

Results. The fitted ROC curves and the performance indices did not change significantly when the confidence rating scales were varied from 6 to 101 points, provided that the estimated operating points obtained directly from the data were distributed relatively evenly over the entire range of true-positive fraction (TPF) and false-positive fraction (FPF). Mapping the likelihood-of-malignancy observer data to the seven-category BI-RADS assessment scale allowed reliable ROC analysis, whereas mapping to the five-category BI-RADS scale could cause erratic ROC curve fitting because of the lack of operating points in the mid-range, or failure of ROC curve fitting because of data degeneracy for some observers.

Conclusion. ROC analysis of discrete confidence rating scales with few but relatively evenly distributed data points over the entire FPF and TPF range is comparable to that of a quasi-continuous rating scale. However, ROC analysis of discrete confidence rating scales with few and unevenly distributed data points may yield unreliable estimates.
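
The idea of re-binning a 0-100 rating scale into fewer categories and then inspecting the resulting operating points can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the ratings are synthetic (drawn from two clipped Gaussians, not the study's observer data), the bins are equal-width (the abstract does not state the mapping that was used), and the area is the empirical trapezoidal (Mann-Whitney) estimate rather than the fitted ROC curves and partial area index reported in the paper. The helper names `quantize`, `operating_points`, and `empirical_auc` are hypothetical.

```python
import random


def quantize(rating, n_categories):
    """Map an integer 0-100 rating to a category index in 0..n_categories-1.

    Equal-width binning is an assumption; the study's mapping is not given here.
    """
    return rating * n_categories // 101


def operating_points(neg, pos):
    """Empirical (FPF, TPF) pair for each threshold 'rating >= t'."""
    thresholds = sorted(set(neg) | set(pos), reverse=True)
    return [
        (sum(n >= t for n in neg) / len(neg),
         sum(p >= t for p in pos) / len(pos))
        for t in thresholds
    ]


def empirical_auc(neg, pos):
    """Trapezoidal area under the empirical ROC curve (ties count one half)."""
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def clip_rating(x):
    """Round to an integer and clip to the 0-100 scale."""
    return max(0, min(100, round(x)))


# Synthetic ratings only; means and spreads are arbitrary illustrative choices.
rng = random.Random(0)
benign = [clip_rating(rng.gauss(40, 20)) for _ in range(200)]
malignant = [clip_rating(rng.gauss(65, 20)) for _ in range(200)]

for k in (101, 11, 6, 3):
    neg = [quantize(r, k) for r in benign]
    pos = [quantize(r, k) for r in malignant]
    points = operating_points(neg, pos)
    print(f"{k:3d} categories: {len(points):3d} operating points, "
          f"empirical AUC = {empirical_auc(neg, pos):.3f}")
```

Printing the `(FPF, TPF)` pairs from `operating_points` for a coarse scale shows whether the few remaining points spread across the whole range or cluster near the corners, which is the condition the abstract identifies as deciding whether ROC analysis stays reliable.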