4.7 Article

Bridging Causal Relevance and Pattern Discriminability: Mining Emerging Patterns from High-Dimensional Data

Journal

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
Volume 25, Issue 12, Pages 2721-2739

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TKDE.2012.218

Keywords

Emerging patterns; causal Bayesian networks; causal relevance; EP discriminability

Funding

  1. National 863 Program of China [2012AA011005]
  2. National 973 Program of China [2013CB329604]
  3. National Natural Science Foundation of China [61229301, 61070131, 61175051, 61005007]
  4. US National Science Foundation [CCF-0905337]
  5. US NASA Research Award [NNX09AK86G]

Abstract

Building an accurate emerging pattern (EP) classifier from high-dimensional data is a nontrivial task because we inevitably face two challenges: 1) how to efficiently extract a minimal set of strongly predictive EPs from an explosive number of candidate patterns, and 2) how to handle the highly sensitive choice of the minimum support threshold. To address these two challenges, we bridge causal relevance and EP discriminability (the predictive ability of emerging patterns) to facilitate EP mining, and we propose a new framework for mining EPs from high-dimensional data. In this framework, we study the relationships between causal relevance in a causal Bayesian network and EP discriminability in EP mining, and then reduce the pattern space of EP mining to either the direct causes and direct effects of the class attribute or its Markov blanket (MB) in a causal Bayesian network. The proposed framework is instantiated by two EP-based classifiers, CE-EP and MB-EP, where CE stands for direct Causes and direct Effects, and MB for Markov Blanket. Extensive experiments on a broad range of data sets validate the effectiveness of the CE-EP and MB-EP classifiers against other well-established methods in terms of predictive accuracy, number of patterns, running time, and sensitivity analysis.
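The idea of restricting the pattern space can be illustrated with a small sketch. It is not the authors' CE-EP or MB-EP algorithm: it assumes the candidate attribute set (for example, a Markov blanket of the class attribute) has already been found by some other means, uses the usual growth-rate definition of an emerging pattern (ratio of a pattern's support in the target class to its support in the other class), and enumerates patterns by brute force. The function names, the toy data, and the thresholds `min_growth` and `max_len` are all hypothetical.

```python
from itertools import combinations

def support(pattern, rows):
    """Fraction of rows that contain every (attribute, value) item of the pattern."""
    if not rows:
        return 0.0
    hits = sum(1 for r in rows if all(r.get(a) == v for a, v in pattern))
    return hits / len(rows)

def growth_rate(pattern, target_rows, other_rows):
    """Support in the target class divided by support in the other class."""
    s_target = support(pattern, target_rows)
    s_other = support(pattern, other_rows)
    if s_target == 0.0:
        return 0.0
    if s_other == 0.0:
        return float("inf")
    return s_target / s_other

def mine_eps(target_rows, other_rows, attributes, min_growth=2.0, max_len=2):
    """Brute-force EP search restricted to the given candidate attributes.

    `attributes` stands in for a precomputed Markov blanket (an assumption);
    attributes outside it are never used to build patterns.
    """
    items = sorted({(a, r[a]) for r in target_rows for a in attributes if a in r})
    eps = []
    for k in range(1, max_len + 1):
        for pattern in combinations(items, k):
            # Skip patterns that assign two values to the same attribute.
            if len({a for a, _ in pattern}) < k:
                continue
            gr = growth_rate(pattern, target_rows, other_rows)
            if gr >= min_growth:
                eps.append((pattern, gr))
    return eps

# Toy data (hypothetical): rows of the target class and of the other class.
target_rows = [
    {"a": 1, "b": 0, "c": 1},
    {"a": 1, "b": 1, "c": 0},
    {"a": 1, "b": 0, "c": 0},
    {"a": 0, "b": 0, "c": 1},
]
other_rows = [
    {"a": 0, "b": 1, "c": 1},
    {"a": 0, "b": 0, "c": 0},
    {"a": 1, "b": 1, "c": 1},
    {"a": 0, "b": 1, "c": 0},
]

# Only attributes "a" and "b" are treated as candidates; "c" is ignored.
eps = mine_eps(target_rows, other_rows, attributes=["a", "b"])
```

Because the enumeration only ever sees the candidate attributes, the number of patterns examined grows with the size of that set rather than with the full dimensionality of the data, which is the effect the abstract describes.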
