Reducing the overfitting in the gROC curve estimation

Journal

COMPUTATIONAL STATISTICS

Publisher

SPRINGER HEIDELBERG
DOI: 10.1007/s00180-023-01344-6

Keywords

Binary classification problem; Cross-validation; Diagnostic problem; gROC curve; Overfitting

The generalized receiver-operating characteristic (gROC) curve is used to evaluate diagnostic tests in which both higher and lower marker values are associated with a higher probability of a positive result. However, using the same data to select the classification subsets and to calculate the gROC curve leads to over-optimistic estimates of future performance. This study explores the bias of the empirical gROC curve estimator and proposes cross-validation-based algorithms that reduce overfitting and improve the estimation of diagnostic accuracy.
The generalized receiver-operating characteristic (gROC) curve considers the classification ability of diagnostic tests when both higher and lower values of the marker are associated with higher probabilities of being positive. Its empirical estimation requires selecting the best classification subsets among those satisfying a particular condition. Both strong and weak consistency of this estimator have already been proved. However, using the same data both to select the classification subsets and to calculate the gROC curve leads to an over-optimistic estimate of the real performance of the diagnostic criteria on future samples. In this work, the bias of the empirical gROC curve estimator is explored through Monte Carlo simulations. In addition, two cross-validation-based algorithms are proposed for reducing the overfitting. The practical application of the proposed algorithms is illustrated through the analysis of a real-world dataset. Simulation results suggest that the empirical gROC curve estimator returns optimistic approximations, especially when the diagnostic capacity of the marker is poor and the sample size is small. The proposed algorithms improve the estimation of the actual diagnostic test accuracy and yield almost unbiased gAUCs (areas under the gROC curve) in most of the considered scenarios. However, the cross-validation-based algorithms reported larger L1-errors than the standard empirical estimators and increase the computational cost of the procedures. The online supplementary material includes an R function that wraps the implemented routines.
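The optimism described in the abstract can be illustrated with a small simulation. The following Python sketch is not the authors' algorithm (their routines are distributed as an R function in the supplementary material); it is a minimal illustration under stated assumptions. It assumes classification subsets of the form "marker <= lo or marker > hi", a continuous marker without ties, and a simple K-fold scheme that pools out-of-fold classifications. All function names (`fit_rules`, `classify`, `area`, `apparent_gauc`, `cv_gauc`) are hypothetical and chosen only for this example.

```python
import numpy as np

def fit_rules(neg, pos, grid):
    """For each target FPR t in grid, choose (lo, hi) maximising the training
    TPR of the rule 'positive if x <= lo or x > hi' with training FPR <= t."""
    n = len(neg)
    c = np.sort(neg)
    lo_cand = np.concatenate(([-np.inf], c))                      # index i: i controls <= lo
    hi_cand = np.concatenate(([np.inf], c[::-1][1:], [-np.inf]))  # index j: j controls > hi
    lo, hi = np.empty(len(grid)), np.empty(len(grid))
    for g, t in enumerate(grid):
        k = int(np.floor(t * n + 1e-9))   # controls allowed to be false positives
        i = np.arange(k + 1)              # i of them below lo, k - i above hi
        los, his = lo_cand[i], hi_cand[k - i]
        tpr = np.mean((pos[None, :] <= los[:, None]) | (pos[None, :] > his[:, None]), axis=1)
        best = int(np.argmax(tpr))
        lo[g], hi[g] = los[best], his[best]
    return lo, hi

def classify(x, lo, hi):
    """Boolean matrix: row g marks the entries of x that rule g labels positive."""
    return (x[None, :] <= lo[:, None]) | (x[None, :] > hi[:, None])

def area(fpr, tpr):
    """Trapezoidal area under the points (fpr, tpr), anchored at (0,0) and (1,1)."""
    x = np.concatenate(([0.0], fpr, [1.0]))
    y = np.concatenate(([0.0], tpr, [1.0]))
    order = np.lexsort((y, x))            # sort by fpr, ties broken by tpr
    x, y = x[order], y[order]
    return float(np.sum(np.diff(x) * (y[1:] + y[:-1]) / 2))

def apparent_gauc(neg, pos, grid):
    """Select the rules and evaluate them on the same data (optimistic)."""
    lo, hi = fit_rules(neg, pos, grid)
    return area(classify(neg, lo, hi).mean(axis=1), classify(pos, lo, hi).mean(axis=1))

def cv_gauc(neg, pos, grid, n_folds=5, rng=None):
    """Select the rules on training folds, classify held-out subjects, pool."""
    rng = np.random.default_rng(rng)
    fold_neg = rng.permutation(len(neg)) % n_folds
    fold_pos = rng.permutation(len(pos)) % n_folds
    out_neg = np.zeros((len(grid), len(neg)), dtype=bool)
    out_pos = np.zeros((len(grid), len(pos)), dtype=bool)
    for f in range(n_folds):
        lo, hi = fit_rules(neg[fold_neg != f], pos[fold_pos != f], grid)
        out_neg[:, fold_neg == f] = classify(neg[fold_neg == f], lo, hi)
        out_pos[:, fold_pos == f] = classify(pos[fold_pos == f], lo, hi)
    return area(out_neg.mean(axis=1), out_pos.mean(axis=1))

# Monte Carlo check with an uninformative marker (true gAUC = 0.5).
rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 21)
apparent, crossval = [], []
for _ in range(20):
    neg, pos = rng.normal(size=40), rng.normal(size=40)
    apparent.append(apparent_gauc(neg, pos, grid))
    crossval.append(cv_gauc(neg, pos, grid, rng=rng))
mean_apparent, mean_cv = float(np.mean(apparent)), float(np.mean(crossval))
print(f"apparent gAUC: {mean_apparent:.3f}   cross-validated gAUC: {mean_cv:.3f}")
```

Because both groups are drawn from the same distribution, any gAUC estimate clearly above 0.5 reflects overfitting rather than real diagnostic ability. Pooling the held-out classifications is only one possible design; averaging per-fold areas would be an alternative, and neither is claimed to match the two algorithms proposed in the paper.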
