
Understanding episode mining techniques: Benchmarking on diverse, realistic, artificial data

INTELLIGENT DATA ANALYSIS (2014)

Journal

INTELLIGENT DATA ANALYSIS

Volume 18, Issue 5, Pages 761-791

Publisher

IOS PRESS

DOI: 10.3233/IDA-140668

Keywords

Pattern mining; episode mining; data generation; quality evaluation

Category

Computer Science, Artificial Intelligence

Abstract

Frequent episode mining has been proposed as a data mining task for recovering sequential patterns from temporal data sequences, and several approaches have been introduced over the last fifteen years. These techniques have, however, never been compared against each other on a large scale, mainly because non-disclosure agreements keep the existing real-life data out of the public domain. We perform such a comparison for the first time. To get around the problem of proprietary data, we employ a data generator that is based on a number of real-life observations and can generate data mimicking the real-life data at our disposal. Artificial data offers the additional advantage that the underlying patterns are known, which is typically not the case for real-life data. Thus, we can evaluate for the first time the ability of mining approaches to recover patterns that are embedded in noise. Our experiments indicate that temporal constraints affect the effectiveness of episode mining more than occurrence semantics do. They also indicate that recovering underlying patterns when several phenomena are present at the same time is rather difficult, and that there is a need to develop better significance measures and techniques for dealing with sets of episodes.
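The idea of evaluating pattern recovery on artificial data can be illustrated with a minimal sketch. It is not the generator or any of the miners used in the paper: the function names (`generate_sequence`, `contains_serial`, `window_frequency`), the uniform noise model, and the simplified sliding-window frequency are all assumptions made for illustration. The sketch embeds a known serial episode into random noise and then measures the fraction of fixed-width windows in which that episode occurs in order.

```python
import random


def generate_sequence(length, noise_alphabet, episode, n_embeddings, max_gap, seed=0):
    """Build a list of event types: uniform noise with `episode` embedded.

    Hypothetical generator for illustration only. Each embedding is placed
    in its own non-overlapping slot, with a random gap of 1..max_gap
    positions between consecutive episode events.
    """
    rng = random.Random(seed)
    seq = [rng.choice(noise_alphabet) for _ in range(length)]
    span = (len(episode) - 1) * max_gap + 1  # worst-case extent of one embedding
    slots = rng.sample(range(length // span), n_embeddings)
    for slot in slots:
        pos = slot * span
        for i, event in enumerate(episode):
            seq[pos] = event
            if i < len(episode) - 1:
                pos += rng.randint(1, max_gap)
    return seq


def contains_serial(window, episode):
    """True if the events of `episode` occur in `window` in the given order."""
    it = iter(window)
    return all(event in it for event in episode)


def window_frequency(seq, episode, width):
    """Fraction of sliding windows of `width` positions that contain `episode`.

    A simplified window-based measure: only windows lying fully inside
    the sequence are counted.
    """
    n_windows = len(seq) - width + 1
    hits = sum(contains_serial(seq[i:i + width], episode) for i in range(n_windows))
    return hits / n_windows


if __name__ == "__main__":
    episode = ["A", "B", "C"]
    seq = generate_sequence(200, ["x", "y", "z"], episode,
                            n_embeddings=5, max_gap=3)
    print("embedded episode:", window_frequency(seq, episode, width=7))
    print("absent episode:  ", window_frequency(seq, ["C", "B", "A"], width=7))
```

Because the embedded episode is known, the score a miner assigns to it can be compared directly with the scores of episodes that were never planted. The `width` and `max_gap` parameters play the role of temporal constraints: a window narrower than the largest embedding can miss occurrences entirely.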
