Journal
DATA MINING AND KNOWLEDGE DISCOVERY
Volume 21, Issue 2, Pages 259-276
Publisher
SPRINGER
DOI: 10.1007/s10618-010-0187-5
Keywords
Exceptional Model Mining; Subgroup Discovery; Information theory
We introduce a new approach to Exceptional Model Mining. Our algorithm, called EMDM, is an iterative method that alternates between Exception Maximisation and Description Minimisation. As a result, it finds maximally exceptional models with minimal descriptions. Exceptional Model Mining was recently introduced by Leman et al. (Exceptional model mining 1-16, 2008) as a generalisation of Subgroup Discovery. Instead of considering a single target attribute, it allows for multiple 'model' attributes on which models are fitted. If the model for a subgroup is substantially different from the model for the complete database, it is regarded as an exceptional model. To measure exceptionality, we propose two information-theoretic measures. One is based on the Kullback-Leibler divergence, the other on Krimp. We show how compression can be used for exception maximisation with these measures, and how classification can be used for description minimisation. Experiments show that our approach efficiently identifies subgroups that are both exceptional and interesting.
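To make the Kullback-Leibler idea concrete, the following is a minimal sketch of scoring a subgroup by how far the distribution of its model attributes diverges from that of the complete database. It is an illustration under stated assumptions, not the measure defined in the paper: the function names (`distribution`, `kl_divergence`, `exceptionality`), the Laplace smoothing parameter `alpha`, and the representation of each row as a tuple of discrete model-attribute values are all hypothetical choices made here.

```python
import math
from collections import Counter


def distribution(rows, alphabet, alpha=1.0):
    """Laplace-smoothed empirical distribution of `rows` over `alphabet`.

    `alpha` is an assumed smoothing constant; it keeps every probability
    strictly positive so the divergence below is always finite.
    """
    counts = Counter(rows)
    total = len(rows) + alpha * len(alphabet)
    return {value: (counts[value] + alpha) / total for value in alphabet}


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in bits.

    Both arguments map the same keys to strictly positive probabilities.
    """
    return sum(p[v] * math.log2(p[v] / q[v]) for v in p)


def exceptionality(database, subgroup_mask):
    """Score a subgroup by the divergence of its model-attribute
    distribution from that of the complete database.

    `database` is a list of hashable rows (tuples of model-attribute
    values); `subgroup_mask` is a parallel list of booleans marking the
    rows that belong to the subgroup.
    """
    alphabet = sorted(set(database))
    subgroup = [row for row, keep in zip(database, subgroup_mask) if keep]
    p = distribution(subgroup, alphabet)
    q = distribution(database, alphabet)
    return kl_divergence(p, q)
```

A subgroup that covers the whole database scores zero, and the score grows as the subgroup's distribution over the model attributes departs from the overall one. This sketch covers only the scoring step; the alternation between exception maximisation and description minimisation described in the abstract is not shown.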