A novel measure of attribute significance with complexity weight

APPLIED SOFT COMPUTING (2019)

Journal

APPLIED SOFT COMPUTING

Volume 82

Publisher

ELSEVIER

DOI: 10.1016/j.asoc.2019.105543

Keywords

Attribute reduction; Attribute significance; Structural risk minimization; Complexity weight; Generalization ability

Funding

National Key R&D Program of China [2017YFB0902100]
National Science and Technology Major Project of China [2017-I-0007-0008]

Abstract

Attribute reduction is one of the most important problems in rough set theory. Conventional attribute reduction algorithms minimize errors on seen objects, that is, they follow empirical risk minimization. In practical applications, however, classification ability on unseen objects, that is, generalization ability, matters more, so a good reduct should generalize well. The structural risk minimization (SRM) inductive principle is an effective tool for controlling the generalization ability of learning machines because it considers complexity and errors on seen objects simultaneously. This paper therefore introduces the SRM principle into the definition of attribute significance, proposes that the number of rules effectively characterizes the actual complexity of a rough set-based classifier, and defines a novel measure of attribute significance with a complexity weight. Based on the new attribute significance, a new heuristic attribute reduction algorithm, called HSRM-R, is developed. 10-fold cross-validation experiments on 21 UCI datasets show that HSRM-R achieves better generalization ability than conventional attribute reduction algorithms based on dependency degree, information entropy, Fisher score, and Laplacian score. Further experiments show that HSRM-R yields fewer rules and a larger support coefficient, meaning that it extracts stronger rules, which partly explains its better generalization ability. Although HSRM-R consumes more time than conventional algorithms, it obtains the best classification accuracy on almost all datasets used in the experiments. The proposed HSRM-R algorithm thus provides an approach to guaranteeing generalization ability theoretically in cases where users require high classification accuracy. © 2019 Elsevier B.V. All rights reserved.
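
To make the idea in the abstract concrete, the following is a minimal sketch of a complexity-weighted attribute significance on a toy decision table. It is an illustration under stated assumptions, not the paper's definition: the abstract does not give the formula, so the penalty form (dependency gain minus `lam` times the rule count divided by the number of objects), the parameter `lam`, and all function names here are hypothetical. The greedy loop is likewise a generic forward selection, not the HSRM-R algorithm itself.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group object indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())


def dependency_degree(rows, decisions, attrs):
    """Fraction of objects lying in classes with a single decision (positive region)."""
    pos = sum(len(b) for b in partition(rows, attrs)
              if len({decisions[i] for i in b}) == 1)
    return pos / len(rows)


def rule_count(rows, decisions, attrs):
    """Distinct (condition values, decision) pairs, used here as the complexity proxy."""
    return len({(tuple(row[a] for a in attrs), d)
                for row, d in zip(rows, decisions)})


def weighted_significance(rows, decisions, selected, candidate, lam):
    """ASSUMED form: dependency gain minus a rule-count penalty scaled by lam."""
    trial = selected + [candidate]
    gain = (dependency_degree(rows, decisions, trial)
            - dependency_degree(rows, decisions, selected))
    penalty = lam * rule_count(rows, decisions, trial) / len(rows)
    return gain - penalty


def greedy_reduct(rows, decisions, lam=0.5):
    """Generic forward selection until the full-table dependency degree is reached."""
    all_attrs = list(range(len(rows[0])))
    target = dependency_degree(rows, decisions, all_attrs)
    selected = []
    while dependency_degree(rows, decisions, selected) < target:
        remaining = [a for a in all_attrs if a not in selected]
        best = max(remaining, key=lambda a: weighted_significance(
            rows, decisions, selected, a, lam))
        selected.append(best)
    return selected


# Toy table: attribute 0 predicts the decision with 2 rules,
# attribute 2 is an ID-like column that also predicts it but needs 4 rules.
rows = [(0, 0, 0), (0, 1, 1), (1, 0, 2), (1, 1, 3)]
decisions = [0, 0, 1, 1]

print(dependency_degree(rows, decisions, [0]), rule_count(rows, decisions, [0]))
print(dependency_degree(rows, decisions, [2]), rule_count(rows, decisions, [2]))
print(greedy_reduct(rows, decisions))
```

On this table, attributes 0 and 2 have the same dependency degree, so a purely dependency-based significance cannot separate them. The rule-count penalty favors attribute 0, which covers the same objects with fewer, better-supported rules, and this mirrors the abstract's argument that fewer and stronger rules go with better generalization.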
