4.5 Article

Instance selection using one-versus-all and one-versus-one decomposition approaches in multiclass classification datasets

Journal

EXPERT SYSTEMS
Volume 40, Issue 6, Pages -

Publisher

WILEY
DOI: 10.1111/exsy.13217

Keywords

data mining; instance selection; machine learning; multiclass classification; one-versus-all; one-versus-one


Instance selection is crucial in data analysis and mining, and various algorithms have been proposed for it. The one-versus-all (OVA) and one-versus-one (OVO) decomposition approaches can effectively decompose multiclass datasets into binary class datasets. This study assessed and compared the performance of instance selection using the OVO, OVA, and baseline approaches. It showed that the OVO approach outperforms the OVA and baseline approaches in terms of AUC, and that it yields better data reduction rates and processing times than the OVA approach.
Instance selection is important in data analysis and mining; it filters out unrepresentative, redundant, or noisy data from a given training set so that models can be learned effectively. Various instance selection algorithms have been proposed in the literature, and their potential and applicability in data cleaning and preprocessing steps have been demonstrated. For multiclass classification datasets, existing instance selection algorithms must deal with all the instances across the different classes simultaneously to produce a reduced training set. Generally, every multiclass classification dataset can be regarded as a complex domain problem, which can be effectively solved using the divide-and-conquer principle. In this study, the one-versus-all (OVA) and one-versus-one (OVO) decomposition approaches were used to decompose a multiclass dataset into multiple binary class datasets. These approaches have been widely employed when constructing classifiers but have never been considered in instance selection. Instance selection performance obtained with the OVA, OVO, and baseline approaches was assessed and compared on 20 multiclass datasets from different domains as the first study and five medical domain datasets as the validation study. Furthermore, three instance selection algorithms were compared: IB3, DROP3, and GA. The results demonstrate that using the OVO approach to perform instance selection makes the support vector machine (SVM) and k-nearest neighbour (k-NN) classifiers perform significantly better than with the OVA and baseline approaches in terms of the area under the ROC curve (AUC), regardless of the instance selection algorithm used. Moreover, the OVO approach provides reasonably good data reduction rates and processing times, both of which are better than those of the OVA approach.
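
The decomposition idea described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: for K classes, OVA builds K binary problems (one class against the rest) and OVO builds K(K-1)/2 problems (one per pair of classes). The selector used here is a simple Edited Nearest Neighbour filter standing in for IB3, DROP3, or GA, and the merge rule (keep an instance only if it survives in every binary problem that contains it) is an assumption, since the abstract does not state how the reduced binary sets are combined. All function names are hypothetical.

```python
from itertools import combinations


def ova_subsets(labels):
    """One-versus-all: one binary problem per class, using every instance."""
    for c in sorted(set(labels)):
        idx = list(range(len(labels)))
        yield idx, [1 if labels[i] == c else 0 for i in idx]


def ovo_subsets(labels):
    """One-versus-one: one binary problem per pair of classes."""
    for a, b in combinations(sorted(set(labels)), 2):
        idx = [i for i, y in enumerate(labels) if y in (a, b)]
        yield idx, [1 if labels[i] == a else 0 for i in idx]


def enn_select(points, binary_labels, k=3):
    """Stand-in selector (Edited Nearest Neighbour), not IB3/DROP3/GA.

    Keeps a point when a strict majority of its k nearest neighbours
    (squared Euclidean distance) carry the same binary label.
    """
    keep = []
    for i, p in enumerate(points):
        dists = sorted(
            (sum((u - v) ** 2 for u, v in zip(p, q)), j)
            for j, q in enumerate(points)
            if j != i
        )
        neighbours = [binary_labels[j] for _, j in dists[:k]]
        agree = sum(1 for y in neighbours if y == binary_labels[i])
        if 2 * agree > len(neighbours):
            keep.append(i)
    return keep


def select_instances(points, labels, decompose, k=3):
    """Run the selector on each binary problem and merge the results.

    Assumed merge rule: an instance is kept only if it survives in every
    binary problem in which it appears.
    """
    seen = [0] * len(points)
    kept = [0] * len(points)
    for idx, bin_labels in decompose(labels):
        sub_points = [points[i] for i in idx]
        survivors = set(enn_select(sub_points, bin_labels, k))
        for local, original in enumerate(idx):
            seen[original] += 1
            if local in survivors:
                kept[original] += 1
    return [i for i in range(len(points)) if seen[i] and kept[i] == seen[i]]
```

Calling `select_instances(points, labels, ovo_subsets)` or `select_instances(points, labels, ova_subsets)` returns the indices of the retained training instances; a classifier such as SVM or k-NN would then be trained on that reduced set. Note that each OVO problem only touches the instances of two classes, whereas each OVA problem processes the full dataset, which is one plausible reason for the processing-time difference reported above.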
