Article

A novel measure of attribute significance with complexity weight

Journal

APPLIED SOFT COMPUTING
Volume 82

Publisher

ELSEVIER
DOI: 10.1016/j.asoc.2019.105543

Keywords

Attribute reduction; Attribute significance; Structural risk minimization; Complexity weight; Generalization ability

Funding

  1. National Key R&D Program of China [2017YFB0902100]
  2. National Science and Technology Major Project of China [2017-I-0007-0008]

Attribute reduction is one of the most important problems in rough set theory. Conventional attribute reduction algorithms minimize errors on seen objects, that is, they follow empirical risk minimization. In practical applications, however, classification ability on unseen objects, namely generalization ability, matters more, so a good reduct should generalize well. The structural risk minimization (SRM) inductive principle is an effective tool for controlling the generalization ability of learning machines because it considers complexity and errors on seen objects simultaneously. This paper therefore introduces the SRM principle into the definition of attribute significance, proposes that the number of rules can effectively characterize the actual complexity of a rough set-based classifier, and defines a novel measure of attribute significance with a complexity weight. Based on this measure, a new heuristic attribute reduction algorithm, HSRM-R, is developed. Experiments using 10-fold cross-validation on 21 UCI datasets show that HSRM-R achieves better generalization ability than conventional attribute reduction algorithms based on dependency degree, information entropy, Fisher score, and Laplacian score. Further experiments show that HSRM-R obtains fewer rules and a larger support coefficient. This means that HSRM-R can extract stronger rules, which partly explains its better generalization ability. Although HSRM-R consumes more time than conventional algorithms, it obtains the best classification accuracy on almost all datasets used in the experiments. Thus, the proposed HSRM-R algorithm offers a theoretically grounded way to guarantee generalization ability when users require high classification accuracy. (C) 2019 Elsevier B.V. All rights reserved.
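
The abstract does not state the paper's formula, so the following is only a minimal sketch of the general idea: score an attribute subset by its dependency degree (fit on seen objects) minus a penalty proportional to the number of rules (complexity). It is not the HSRM-R algorithm. The linear form of the penalty, the weight `lam`, the choice of one rule per equivalence class, and all function names are assumptions made for illustration.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group object indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())


def dependency_and_rules(rows, labels, attrs):
    """Return (dependency degree, number of rules) for an attribute subset.

    Dependency degree: fraction of objects lying in label-consistent blocks
    (the positive region). Number of rules: ASSUMED here to be the number of
    equivalence classes, i.e. one rule per block.
    """
    blocks = partition(rows, attrs)
    positive = sum(len(b) for b in blocks if len({labels[i] for i in b}) == 1)
    return positive / len(rows), len(blocks)


def weighted_score(rows, labels, attrs, lam=0.1):
    """ASSUMED SRM-style score: fit minus lam * (rules per object)."""
    gamma, n_rules = dependency_and_rules(rows, labels, attrs)
    return gamma - lam * n_rules / len(rows)


def greedy_reduct(rows, labels, lam=0.1):
    """Forward selection: add the attribute that most improves the score."""
    remaining = set(range(len(rows[0])))
    selected = []
    best = weighted_score(rows, labels, selected, lam)
    while remaining:
        cand = max(
            sorted(remaining),
            key=lambda a: weighted_score(rows, labels, selected + [a], lam),
        )
        cand_score = weighted_score(rows, labels, selected + [cand], lam)
        if cand_score <= best:
            break  # no attribute improves the penalized score
        selected.append(cand)
        remaining.remove(cand)
        best = cand_score
    return selected


# Toy decision table (hypothetical data): attribute 0 determines the label
# with two rules; attribute 1 is ID-like and needs four rules; attribute 2
# carries no label information.
rows = [(0, 0, 1), (0, 1, 0), (1, 2, 1), (1, 3, 0)]
labels = [0, 0, 1, 1]
reduct = greedy_reduct(rows, labels)
```

In this toy table, attributes 0 and 1 both reach a dependency degree of 1, so a purely error-based measure cannot separate them. The rule-count penalty favors attribute 0, which needs two rules instead of four, and that is the kind of preference for fewer, stronger rules the abstract describes.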
