4.6 Article

e-RNSP: An Efficient Method for Mining Repetition Negative Sequential Patterns

Journal

IEEE TRANSACTIONS ON CYBERNETICS
Volume 50, Issue 5, Pages 2084-2096

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TCYB.2018.2869907

Keywords

Data mining; Databases; Insurance; Companies; DNA; Medical services; Automobiles; Negative sequential patterns (NSPs); repetition NSPs (RNSPs); repetition patterns; sequence analysis

Funding

  1. National Natural Science Foundation of China [71271125, 61502260]
  2. Natural Science Foundation of Shandong Province, China [ZR2018MF011]
  3. Australian Research Council [DP130102691]


Negative sequential patterns (NSPs), which capture both frequently occurring and nonoccurring behaviors, are becoming increasingly important and sometimes play a role that cannot be filled by analyzing occurring behaviors alone. Repetition sequential patterns capture repetitions of patterns across different sequences as well as within a single sequence, and are important for understanding the repetition relations between behaviors. Although some methods are available for mining NSPs and repetition positive sequential patterns (RPSPs), we have not found any method for mining repetition NSPs (RNSPs). RNSPs can help analysts further understand the repetition relationships between items and capture more comprehensive information with repetition properties. However, mining RNSPs is much more difficult than mining NSPs because of the intrinsic challenges of nonoccurring items. To address these issues, we first propose a formal definition of repetition negative containment. Then, we propose a method to convert repetition negative containment into repetition positive containment, which quickly calculates the repetition supports using only the corresponding RPSPs' information, without rescanning the database. Finally, we propose an efficient algorithm, called e-RNSP, to mine RNSPs. To the best of our knowledge, e-RNSP is the first algorithm to efficiently mine RNSPs. Extensive experimental results on the first four real and synthetic datasets clearly show that e-RNSP can efficiently discover repetition negative patterns; results on the fifth dataset demonstrate the effectiveness of the RNSPs captured by the proposed method; and results on the remaining 16 datasets analyze the impact of data characteristics on the mining process.
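
The difference between counting a pattern once per sequence and counting its repetitions within a sequence can be illustrated with a small sketch. This is not the e-RNSP algorithm and it handles positive patterns only. It assumes single-item elements and a greedy, left-to-right, non-overlapping way of counting occurrences, which may differ from the formal definition of repetition support used in the paper. The function names are hypothetical.

```python
from typing import Sequence


def count_repetitions(sequence: Sequence[str], pattern: Sequence[str]) -> int:
    """Count non-overlapping occurrences of `pattern` as a gapped
    subsequence of `sequence`, scanning greedily from left to right.
    (Assumed counting rule, for illustration only.)"""
    if not pattern:
        return 0
    count = 0
    pos = 0  # index of the next pattern item we are waiting for
    for item in sequence:
        if item == pattern[pos]:
            pos += 1
            if pos == len(pattern):
                # One full occurrence matched; start looking for the next one.
                count += 1
                pos = 0
    return count


def repetition_support(database: Sequence[Sequence[str]],
                       pattern: Sequence[str]) -> int:
    """Total number of repetitions across all sequences."""
    return sum(count_repetitions(seq, pattern) for seq in database)


def sequence_support(database: Sequence[Sequence[str]],
                     pattern: Sequence[str]) -> int:
    """Classical support: number of sequences containing the pattern
    at least once."""
    return sum(1 for seq in database if count_repetitions(seq, pattern) > 0)
```

For a toy database of the three sequences `abab`, `ab`, and `ba`, the pattern `a, b` has a classical support of 2 under this sketch but a repetition support of 3, because the first sequence contains it twice.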
