Article

Discovering Categorical Main and Interaction Effects Based on Association Rule Mining

Journal

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
Volume 35, Issue 2, Pages 1379-1390

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TKDE.2021.3087343

Keywords

Feature extraction; Data mining; Computational modeling; Itemsets; Task analysis; Encoding; Databases; Apriori; association rule mining; categorical feature; feature selection; interaction

As data sets grow, feature selection becomes increasingly important. Taking interactions between the original features into account leads to extremely high dimensionality, especially when the features are categorical and one-hot encoding is applied. This makes it all the more worthwhile to mine useful features together with their interactions. Association rule mining aims to extract interesting correlations between items, but the rules themselves are difficult to use directly as a competent classifier. Drawing inspiration from association rule mining, we propose a method that uses association rules to select features and their interactions, and we then modify the algorithm to address several practical concerns. We analyze the computational complexity of the proposed algorithm to show its efficiency, and the results of a series of experiments verify its effectiveness.
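
The general idea of using association rules to surface categorical main effects and interactions can be illustrated with a minimal sketch. This is not the authors' algorithm: it enumerates candidate itemsets by brute force instead of Apriori-style pruning, and the function name `mine_rules`, the thresholds `min_support` and `min_confidence`, and the toy data are all assumptions made for illustration. Each item is a (feature, value) pair, which corresponds to one column of a one-hot encoding. A rule with one item suggests a main effect, and a rule with two items suggests a pairwise interaction.

```python
from collections import Counter
from itertools import combinations


def mine_rules(rows, labels, min_support=0.3, min_confidence=0.8, max_order=2):
    """Find rules 'itemset -> label' that pass support and confidence thresholds.

    Hypothetical helper for illustration only (brute-force enumeration,
    not the paper's method). rows is a list of dicts mapping a categorical
    feature name to its value; labels is a list of class labels.
    """
    n = len(rows)
    itemset_counts = Counter()  # how often each itemset occurs
    joint_counts = Counter()    # how often each (itemset, label) pair occurs
    for row, label in zip(rows, labels):
        items = sorted(row.items())  # (feature, value) pairs in a fixed order
        for k in range(1, max_order + 1):
            for itemset in combinations(items, k):
                itemset_counts[itemset] += 1
                joint_counts[itemset, label] += 1

    rules = []
    for (itemset, label), joint in joint_counts.items():
        support = joint / n                          # P(itemset and label)
        confidence = joint / itemset_counts[itemset]  # P(label | itemset)
        if support >= min_support and confidence >= min_confidence:
            rules.append((itemset, label, support, confidence))
    return rules


# Toy data (made up): the label depends on color and size jointly,
# not on either feature alone.
rows = [
    {"color": "red", "size": "S"},
    {"color": "red", "size": "L"},
    {"color": "blue", "size": "S"},
    {"color": "blue", "size": "L"},
    {"color": "red", "size": "S"},
    {"color": "blue", "size": "L"},
]
labels = [1, 0, 0, 1, 1, 1]

for itemset, label, support, confidence in mine_rules(rows, labels):
    print(itemset, "->", label, round(support, 3), round(confidence, 3))
```

On this toy data, no single (feature, value) item reaches the confidence threshold, while the pairs (color=red, size=S) and (color=blue, size=L) both do. That is the kind of interaction a rule-based selector would keep without expanding every pairwise product of one-hot columns.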
