4.5 Article

Finding the most interesting correlations in a database: how hard can it be?

Journal

INFORMATION SYSTEMS
Volume 30, Issue 1, Pages 21-46

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.is.2003.08.004

Keywords

-

Ask authors/readers for more resources

This paper addresses some of the foundational issues associated with discovering the best few correlations from a database. Specifically, we consider the computational complexity of various definitions of the top-k correlation problem. where the goal is to discover the few sets of events whose co-occurrence exhibits the smallest degree of independence. Our results show that many rigorous definitions of correlation lead to intractable and strongly inapproximable problems. Proof of this inapproximability is significant, since similar problems studied by the computer science theory community have resisted such analysis. One goal of the paper (and for future research) is to develop alternative correlation metrics whose use will both allow efficient search and produce results that are satisfactory for users. (C) 2003 Elsevier Ltd. All rights reserved.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.5
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available