Variance reduction in estimating classification error using sparse datasets

CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS (2005)

Journal

CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS

Volume 79, Issue 1-2, Pages 91-100

Publisher

ELSEVIER

DOI: 10.1016/j.chemolab.2005.04.008

Keywords

error rate estimation; crossvalidation; bootstrap resampling; small sample size

Categories

Automation & Control Systems; Chemistry, Analytical; Computer Science, Artificial Intelligence; Instruments & Instrumentation; Mathematics, Interdisciplinary Applications; Statistics & Probability

Abstract

In biomedical applications, frequently only a limited number of samples are available for the development and testing of classification rules. Understanding the behavior of the error estimators in this setting is therefore highly desirable. In an extensive study using simulated as well as real-life data we investigated the properties of commonly used error estimators in terms of their bias and variance, and have found that in these small-sample size situations, the influence of variance on the error estimates can be significant, and can dominate the bias. Consequently, our results strongly suggest that bootstrap resampling and/or k-fold crossvalidation-based estimators, especially when computed over multiple data splits, should be preferred in these small-sample size scenarios, because of their reduced variance compared to the more routinely used crossvalidation approaches. While linear partial least squares was used as the classifier/regressor, the general conclusions arising from this study are not qualitatively affected for other classifiers, linear or nonlinear. (c) 2005 Elsevier B.V. All rights reserved.
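The two estimator families the abstract recommends can be illustrated with a minimal sketch. It is not the authors' procedure: a nearest-class-mean classifier stands in for partial least squares, the toy data are synthetic, and every function name and default (`k=5`, `repeats=20`, `replicates=50`) is an assumption made for illustration only.

```python
import random
import statistics


def fit_nearest_mean(X, y):
    """Compute one mean vector per class (stand-in classifier, not PLS)."""
    means = {}
    for label in sorted(set(y)):
        rows = [x for x, t in zip(X, y) if t == label]
        means[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return means


def predict_nearest_mean(means, x):
    """Assign x to the class whose mean is closest in squared distance."""
    return min(
        means,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(x, means[label])),
    )


def split_error(X, y, train_idx, test_idx):
    """Train on train_idx and return the misclassification count on test_idx."""
    means = fit_nearest_mean([X[i] for i in train_idx], [y[i] for i in train_idx])
    return sum(predict_nearest_mean(means, X[i]) != y[i] for i in test_idx)


def repeated_kfold_errors(X, y, k=5, repeats=20, seed=0):
    """Return one pooled k-fold error estimate per random data split (k >= 2)."""
    rng = random.Random(seed)
    n = len(X)
    estimates = []
    for _ in range(repeats):
        idx = list(range(n))
        rng.shuffle(idx)
        folds = [idx[f::k] for f in range(k)]
        wrong = 0
        for f, test_idx in enumerate(folds):
            train_idx = [i for g, fold in enumerate(folds) if g != f for i in fold]
            wrong += split_error(X, y, train_idx, test_idx)
        estimates.append(wrong / n)
    return estimates


def bootstrap_oob_errors(X, y, replicates=50, seed=0):
    """Return one out-of-bag error estimate per bootstrap replicate."""
    rng = random.Random(seed)
    n = len(X)
    estimates = []
    for _ in range(replicates):
        train_idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        in_bag = set(train_idx)
        test_idx = [i for i in range(n) if i not in in_bag]
        if not test_idx:  # no held-out samples in this replicate
            continue
        estimates.append(split_error(X, y, train_idx, test_idx) / len(test_idx))
    return estimates


def make_toy_data(n_per_class=15, shift=1.0, seed=1):
    """Two Gaussian classes in two dimensions (hypothetical small-sample data)."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            X.append([rng.gauss(label * shift, 1.0), rng.gauss(label * shift, 1.0)])
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    kfold = repeated_kfold_errors(X, y)
    boot = bootstrap_oob_errors(X, y)
    print("repeated 5-fold: mean", statistics.mean(kfold),
          "sd over splits", statistics.stdev(kfold))
    print("bootstrap OOB: mean", statistics.mean(boot))
```

Averaging the returned list gives the final estimate in either case. The spread across the repeated k-fold entries shows how much a single split can move the estimate on one dataset, which is the variation that averaging over multiple splits smooths out. Setting `k` to `len(X)` gives leave-one-out, where every repeat returns the same value because the split no longer depends on the shuffle.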
