Article

Comparison of interestingness measures for web usage mining: An empirical study

Publisher

WORLD SCIENTIFIC PUBL CO PTE LTD
DOI: 10.1142/S0219622007002368

Keywords

web log mining; interestingness measures; association rule mining; sequential pattern mining

A common problem in mining association rules or sequential patterns is that a large number of rules or patterns can be generated from a database, making it impossible for a human analyst to digest the results. Solutions to the problem include, among others, using interestingness measures to identify interesting rules or patterns and pruning rules that are considered redundant. Various interestingness measures have been proposed, but little work has been reported on the effectiveness of these measures in real-world applications. We present an application of Web usage mining to a large collection of Livelink log data. Livelink is a web-based product of Open Text Corporation that provides automatic management and retrieval of different types of information objects over an intranet, an extranet, or the Internet. We report our experience in preprocessing raw log data, mining association rules and sequential patterns from the log data, and identifying interesting rules and patterns using interestingness measures and some pruning methods. In particular, we evaluate a number of interestingness measures in terms of their effectiveness in finding interesting association rules and sequential patterns. Our results show that some measures are much more effective than others.
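
To make the idea of an interestingness measure concrete, the sketch below computes three widely used measures (support, confidence, and lift) for a single association rule. The abstract does not say which measures the study evaluated, so these three are an assumption chosen for illustration, and the session data, page names, and function names are all hypothetical rather than taken from the Livelink logs.

```python
from typing import Dict, FrozenSet, List

# Hypothetical sessions: each is the set of pages requested in one visit.
# These are illustrative values, not data from the study.
sessions: List[FrozenSet[str]] = [
    frozenset({"login", "search", "view"}),
    frozenset({"login", "search"}),
    frozenset({"login", "view"}),
    frozenset({"search", "view"}),
    frozenset({"login", "search", "view", "download"}),
]


def support(itemset: FrozenSet[str], data: List[FrozenSet[str]]) -> float:
    """Fraction of sessions that contain every item in `itemset`."""
    return sum(1 for s in data if itemset <= s) / len(data)


def rule_measures(
    antecedent: FrozenSet[str],
    consequent: FrozenSet[str],
    data: List[FrozenSet[str]],
) -> Dict[str, float]:
    """Support, confidence, and lift for the rule antecedent -> consequent."""
    sup_a = support(antecedent, data)
    sup_c = support(consequent, data)
    sup_ac = support(antecedent | consequent, data)
    # Confidence: how often the consequent appears when the antecedent does.
    confidence = sup_ac / sup_a if sup_a else 0.0
    # Lift: confidence relative to the consequent's baseline frequency.
    lift = confidence / sup_c if sup_c else 0.0
    return {"support": sup_ac, "confidence": confidence, "lift": lift}


measures = rule_measures(frozenset({"search"}), frozenset({"view"}), sessions)
print(measures)
```

A lift below 1 for a rule such as search -> view would suggest the two pages co-occur less often than expected if they were independent, which is one way a measure can flag a frequent rule as uninteresting. Ranking the mined rules by different measures and comparing the top of each ranking is a simple way to see how much the measures disagree.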
