期刊
IEEE TRANSACTIONS ON CYBERNETICS
卷 46, 期 3, 页码 595-608出版社
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TCYB.2015.2410143
关键词
Classification; data mining (DM); data preprocessing; discretization; evolutionary algorithms (EAs); numerical attributes
类别
资金
- Spanish Ministry of Science and Technology [TIN2011-28488]
- Andalusian Research Plans [P11-TIC-7765, P10-TIC-6858]
Discretization is one of the most relevant techniques for data preprocessing. The main goal of discretization is to transform numerical attributes into discrete ones to help the experts to understand the data more easily, and it also provides the possibility to use some learning algorithms which require discrete data as input, such as Bayesian or rule learning. We focus our attention on handling multivariate classification problems, where high interactions among multiple attributes exist. In this paper, we propose the use of evolutionary algorithms to select a subset of cut points that defines the best possible discretization scheme of a data set using a wrapper fitness function. We also incorporate a reduction mechanism to successfully manage the multivariate approach on large data sets. Our method has been compared with the best state-of-the-art discretizers on 45 real datasets. The experiments show that our proposed algorithm overcomes the rest of the methods producing competitive discretization schemes in terms of accuracy, for C4.5, Naive Bayes, PART, and PrUning and BuiLding Integrated in Classification classifiers; and obtained far simpler solutions.
作者
我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。
推荐
暂无数据