☆ 4.4 Article

Top-k phi Correlation Computation

INFORMS JOURNAL ON COMPUTING (2008)

Journal

INFORMS JOURNAL ON COMPUTING

Volume 20, Issue 4, Pages 539-552

Publisher

INFORMS

DOI: 10.1287/ijoc.1080.0265

Keywords

phi correlation coefficient; Pearson's correlation coefficient; statistical computing

Funding

Rutgers Business School, Newark and New Brunswick
Rutgers University, New Brunswick

Ask authors/readers for more resources

Protocol

Community support

Reagent

Community support

Abstract

Recently, there has been considerable interest in efficiently computing strongly correlated pairs in large databases. Most previous studies require the specification of a minimum correlation threshold to perform the computation. However, it may be difficult for users to provide an appropriate threshold in practice because different data sets typically have different characteristics. To this end, in this paper, we propose an alternative task: finding the top-k strongly correlated pairs. Consequently, we identify a two-dimensional monotone property of an upper bound of phi correlation coefficient and develop an efficient algorithm, called TOP-COP, to exploit this property to effectively prune many pairs even without computing their correlation coefficients. Our experimental results show that TOP-COP can be an order of magnitude faster than alternative approaches for mining the top-k strongly correlated pairs. Finally, we show that the performance of the TOP-COP algorithm is tightly related to the degree of data dispersion. Indeed, the higher the degree of data dispersion, the larger the computational savings achieved by the TOP-COP algorithm.

Top-k phi Correlation Computation

Journal

INFORMS JOURNAL ON COMPUTING

Publisher

INFORMS

Keywords

Categories

Funding

Ask authors/readers for more resources

Protocol

Reagent

Authors

I am an author on this paper

Reviews

Primary Rating

Secondary Ratings

Novelty

Significance

Scientific rigor

Rate this paper

Recommended

Top-k phi Correlation Computation

Journal

INFORMS JOURNAL ON COMPUTING

Publisher

INFORMS

Keywords

Categories

Funding

Ask authors/readers for more resources

Protocol

Reagent

Authors

I am an author on this paper

Reviews

Primary Rating

Secondary Ratings

Novelty

Significance

Scientific rigor

Rate this paper

Recommended

Export Citation

Share Paper