Journal
COMPUTATIONAL STATISTICS & DATA ANALYSIS
Volume 98, Pages 46-59
Publisher
ELSEVIER
DOI: 10.1016/j.csda.2015.12.009
Keywords
High dimension; Imbalance; Classification
A binary classification problem is imbalanced when the numbers of samples from the two groups differ. In the high-dimensional case, where the number of variables is much larger than the number of samples, imbalance leads to a bias in the classification. The independence classifier is studied theoretically and, based on the analysis, two new classifiers are suggested that can handle any imbalance ratio. The analytical results are supplemented by a simulation study, in which the suggested classifiers outperform multiple undersampling in some respects. For correlated data, the ROAD classifier is considered, and a suggestion is given for how to modify it to handle the bias from imbalanced group sizes. © 2015 Elsevier B.V. All rights reserved.
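The bias described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code and does not implement their two proposed classifiers. It only applies a plain independence (diagonal discriminant) rule to independent Gaussian variables and compares the error rates of the two groups. The group sizes, dimension, mean shift, and every function name are assumptions chosen for illustration.

```python
import random


def column_means(rows):
    """Per-variable mean of a list of equally long samples."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def pooled_variances(rows1, rows2, mean1, mean2):
    """Per-variable pooled variance; covariances between variables are ignored."""
    dof = len(rows1) + len(rows2) - 2
    ss = [0.0] * len(mean1)
    for rows, mean in ((rows1, mean1), (rows2, mean2)):
        for row in rows:
            for j, (x, m) in enumerate(zip(row, mean)):
                ss[j] += (x - m) ** 2
    return [s / dof for s in ss]


def independence_score(x, mean1, mean2, var):
    """Diagonal discriminant score; a positive value assigns x to group 1."""
    return sum(
        (xj - 0.5 * (m1 + m2)) * (m1 - m2) / v
        for xj, m1, m2, v in zip(x, mean1, mean2, var)
    )


def simulate_error_rates(n1=10, n2=50, p=1000, n_signal=10, shift=1.0,
                         n_test=100, seed=0):
    """Return (error rate of group 1, error rate of group 2) on fresh test data.

    All default values are illustrative assumptions, not taken from the paper.
    Group 1 is the minority group (n1 < n2), and p is much larger than n1 + n2.
    """
    rng = random.Random(seed)
    # Only the first n_signal variables separate the groups.
    mu1 = [shift if j < n_signal else 0.0 for j in range(p)]
    mu2 = [0.0] * p

    def draw(mu, n):
        return [[rng.gauss(m, 1.0) for m in mu] for _ in range(n)]

    train1, train2 = draw(mu1, n1), draw(mu2, n2)
    mean1, mean2 = column_means(train1), column_means(train2)
    var = pooled_variances(train1, train2, mean1, mean2)

    test1, test2 = draw(mu1, n_test), draw(mu2, n_test)
    err1 = sum(independence_score(x, mean1, mean2, var) <= 0 for x in test1) / n_test
    err2 = sum(independence_score(x, mean1, mean2, var) > 0 for x in test2) / n_test
    return err1, err2


if __name__ == "__main__":
    err_minority, err_majority = simulate_error_rates()
    print(f"minority error: {err_minority:.2f}, majority error: {err_majority:.2f}")
```

Under these assumptions, the noisier mean estimate of the smaller group shifts the score toward the larger group, so the minority group should show the higher error rate. Setting `n1` equal to `n2` gives a balanced baseline for comparison.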