Article

Variance reduction in estimating classification error using sparse datasets

Publisher

ELSEVIER
DOI: 10.1016/j.chemolab.2005.04.008

Keywords

error rate estimation; crossvalidation; bootstrap resampling; small sample size

In biomedical applications, often only a limited number of samples is available for developing and testing classification rules. Understanding the behavior of error estimators in this setting is therefore highly desirable. In an extensive study using simulated as well as real-life data, we investigated the bias and variance of commonly used error estimators and found that, in these small-sample-size situations, the influence of variance on the error estimates can be significant and can dominate the bias. Consequently, our results strongly suggest that bootstrap resampling and/or k-fold crossvalidation-based estimators, especially when computed over multiple data splits, should be preferred in small-sample-size scenarios because of their reduced variance compared to the more routinely used crossvalidation approaches. While linear partial least squares was used as the classifier/regressor, the general conclusions of this study are not qualitatively affected for other classifiers, linear or nonlinear. © 2005 Elsevier B.V. All rights reserved.
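
The comparison described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' protocol. It assumes synthetic two-class Gaussian data and swaps in a simple nearest-centroid rule for the partial least squares classifier used in the study. All function names and parameter values (`make_data`, `count_errors`, `simulate`, the class shift of 0.8, and so on) are hypothetical choices made for illustration. For each of many independently drawn small datasets it computes a leave-one-out estimate, a k-fold estimate averaged over several random splits, and a bootstrap out-of-bag estimate, then reports the mean and standard deviation of each estimator across datasets.

```python
import random
import statistics


def make_data(n_per_class, n_features, shift, rng):
    """Two Gaussian classes; class 1 is offset by `shift` in every feature."""
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            X.append([rng.gauss(label * shift, 1.0) for _ in range(n_features)])
            y.append(label)
    return X, y


def count_errors(train, test, X, y):
    """Fit a nearest-centroid rule on `train` and count mistakes on `test`."""
    centroids = {}
    for label in sorted({y[i] for i in train}):
        rows = [X[i] for i in train if y[i] == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def predict(x):
        return min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])),
        )

    return sum(predict(X[i]) != y[i] for i in test)


def kfold_error(X, y, k, rng):
    """One k-fold cross-validation estimate from a single random split."""
    n = len(y)
    idx = list(range(n))
    rng.shuffle(idx)
    wrong = 0
    for f in range(k):
        test = idx[f::k]
        held_out = set(test)
        train = [i for i in range(n) if i not in held_out]
        wrong += count_errors(train, test, X, y)
    return wrong / n


def repeated_kfold_error(X, y, k, repeats, rng):
    """Average of k-fold estimates over several independent data splits."""
    return statistics.fmean(kfold_error(X, y, k, rng) for _ in range(repeats))


def loo_error(X, y, rng):
    """Leave-one-out is k-fold with k equal to the sample size."""
    return kfold_error(X, y, len(y), rng)


def bootstrap_oob_error(X, y, n_boot, rng):
    """Mean out-of-bag error over bootstrap resamples (drawn with replacement)."""
    n = len(y)
    rates = []
    for _ in range(n_boot):
        train = [rng.randrange(n) for _ in range(n)]
        in_bag = set(train)
        test = [i for i in range(n) if i not in in_bag]
        if test:  # skip the rare resample that leaves nothing out
            rates.append(count_errors(train, test, X, y) / len(test))
    return statistics.fmean(rates)


def simulate(n_datasets=200, n_per_class=10, n_features=5, shift=0.8, seed=0):
    """Mean and spread of each estimator across independently drawn datasets."""
    rng = random.Random(seed)
    estimators = {
        "leave-one-out": lambda X, y: loo_error(X, y, rng),
        "10 x 5-fold CV": lambda X, y: repeated_kfold_error(X, y, 5, 10, rng),
        "bootstrap (out-of-bag)": lambda X, y: bootstrap_oob_error(X, y, 50, rng),
    }
    results = {name: [] for name in estimators}
    for _ in range(n_datasets):
        X, y = make_data(n_per_class, n_features, shift, rng)
        for name, estimate in estimators.items():
            results[name].append(estimate(X, y))
    return {
        name: (statistics.fmean(v), statistics.stdev(v))
        for name, v in results.items()
    }


if __name__ == "__main__":
    for name, (mean, sd) in simulate().items():
        print(f"{name:24s} mean error = {mean:.3f}   sd = {sd:.3f}")
```

The standard deviation column is the quantity of interest here: it measures how much each estimate varies from one small dataset to the next. Any ordering of the estimators produced by this toy setup depends on the assumed data model and classifier and should not be read as a reproduction of the paper's results.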
