Article

Optimal two-phase sampling for estimating the area under the receiver operating characteristic curve

STATISTICS IN MEDICINE (2021)

Journal

STATISTICS IN MEDICINE

Volume 40, Issue 4, Pages 1059-1071

Publisher

WILEY

DOI: 10.1002/sim.8819

Keywords

area under a ROC curve; one-phase random sampling; optimal sampling probabilities; relative efficiency; two-phase sampling

Automated Summary

This study explores the optimal two-phase sampling design for evaluating the performance of an ordinal test in classifying disease status. Simulation results show that two-phase sampling under the optimal sampling probabilities can substantially reduce the variance of the AUC estimator by oversampling subjects at low and high ordinal levels. Compared with proportional allocation, this approach improves efficiency.

Abstract

Statistical methods are well developed for estimating the area under the receiver operating characteristic curve (AUC) based on a random sample where the gold standard is available for every subject in the sample, or on a two-phase sample where the gold standard is ascertained only at the second phase for a subset of subjects sampled using fixed sampling probabilities. However, the methods based on a two-phase sample do not attempt to optimize the sampling probabilities to minimize the variance of the AUC estimator. In this paper, we consider the optimal two-phase sampling design for evaluating the performance of an ordinal test in classifying disease status. We derived the analytic variance formula for the AUC estimator and used it to obtain the optimal sampling probabilities. The efficiency of two-phase sampling under the optimal sampling probabilities (OA) is evaluated by a simulation study, which indicates that two-phase sampling under OA achieves a substantial variance reduction by oversampling subjects with low and high ordinal levels, compared with two-phase sampling under proportional allocation (PA). Furthermore, in comparison with one-phase random sampling, two-phase sampling under OA or PA has a clear advantage in reducing the variance of the AUC estimator when the variance of diagnostic test results in the disease population is small relative to its counterpart in the nondisease population. Finally, we applied the optimal two-phase sampling design to a real-world example to evaluate the performance of a questionnaire score in screening for childhood asthma.
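To make the two-phase setting concrete, the sketch below simulates an ordinal test, verifies disease status only for a level-dependent subsample, and estimates the AUC with inverse-probability weights. It is an illustration under stated assumptions, not the estimator or the optimal allocation derived in the paper: the function names `weighted_auc` and `two_phase_sample`, the simulated data, and the selection probabilities in `probs` (which merely oversample the lowest and highest levels) are all hypothetical. The AUC for an ordinal test is taken here as P(T_D > T_N) + 0.5 P(T_D = T_N), where T_D and T_N are the test levels of a diseased and a nondiseased subject.

```python
import numpy as np


def weighted_auc(levels, disease, weights):
    """Weighted estimate of P(T_D > T_N) + 0.5 * P(T_D = T_N) for an ordinal test.

    Assumes at least one diseased and one nondiseased subject with positive weight.
    """
    levels = np.asarray(levels)
    disease = np.asarray(disease, dtype=bool)
    weights = np.asarray(weights, dtype=float)
    cats = np.unique(levels)  # sorted distinct ordinal levels
    # Weighted counts per level, separately for diseased and nondiseased subjects.
    d = np.array([weights[(levels == k) & disease].sum() for k in cats])
    n = np.array([weights[(levels == k) & ~disease].sum() for k in cats])
    p_d = d / d.sum()
    p_n = n / n.sum()
    # Probability that a nondiseased subject falls strictly below each level.
    below = np.concatenate(([0.0], np.cumsum(p_n)[:-1]))
    return float(np.sum(p_d * (below + 0.5 * p_n)))


def two_phase_sample(levels, probs, rng):
    """Select phase-2 subjects with probability probs[level]; return mask and weights."""
    pi = probs[np.asarray(levels)]
    selected = rng.random(pi.size) < pi
    return selected, 1.0 / pi


# Phase 1: ordinal test levels 0..4 observed for everyone (simulated, hypothetical data).
rng = np.random.default_rng(0)
n_subjects = 5000
disease = rng.random(n_subjects) < 0.2
latent = rng.normal(loc=np.where(disease, 1.0, 0.0), scale=1.0)
levels = np.digitize(latent, [-0.5, 0.5, 1.0, 1.5])

# Phase 2: disease status is verified only for the selected subjects.
# These probabilities are illustrative, NOT the optimal ones from the paper.
probs = np.array([0.6, 0.2, 0.2, 0.2, 0.6])
selected, weights = two_phase_sample(levels, probs, rng)

auc_full = weighted_auc(levels, disease, np.ones(n_subjects))
auc_ipw = weighted_auc(levels[selected], disease[selected], weights[selected])
print(f"full-verification AUC: {auc_full:.3f}")
print(f"two-phase IPW AUC:     {auc_ipw:.3f}")
```

Repeating the phase-2 draw many times for different choices of `probs` and comparing the spread of `auc_ipw` is one simple way to explore how the allocation affects the estimator's variance, which is the quantity the paper minimizes analytically.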
