Article

A three-way approach for learning rules in automatic knowledge-based topic models

Journal

Publisher

ELSEVIER SCIENCE INC
DOI: 10.1016/j.ijar.2016.12.011

Keywords

Topic models; Automatic knowledge-based models; Game-theoretic rough sets; Three-way decisions

Topic modeling aims to uncover hidden thematic structures in a collection of documents by representing them as a set of topics. Automatic knowledge-based topic models were recently introduced to meet the demands of processing large-scale text collections. They rely on the automatic extraction of rules from multiple domain corpora. The extracted rules are generally large in number, and thresholds are used to select only a small number of useful rules. This way of selecting important rules has two shortcomings. First, the same fixed thresholds are used to extract rules from all domain corpora. Second, the thresholds are predefined or set explicitly from expert opinion rather than determined by an automated mechanism. In this article, we address these shortcomings with a three-way approach that distinguishes rules having strong positive associations, rules having strong negative associations, and rules having weak associations. A pair of thresholds defines and controls this three-way partitioning of the rules. It is argued that the domain-specific and automated selection of thresholds in the three-way framework may be approached from the viewpoint of a tradeoff between the quantity of rules and the quality of rules. We apply the game-theoretic rough set (GTRS) model to implement this tradeoff. Algorithms using the GTRS are introduced for automatically determining the thresholds. Experimental results on the Chen2014 dataset suggest an average improvement of 52.82 points in topic coherence by increasing the quantity of rules to 17.93%. © 2016 Elsevier Inc. All rights reserved.
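
As a rough illustration of how a pair of thresholds can split rules into three regions, the sketch below partitions word-pair rules by an association score. It is a minimal sketch under stated assumptions, not the paper's algorithm: the function name three_way_partition, the parameter names alpha and beta, the score range, and the toy word pairs are all hypothetical, and the thresholds are fixed by hand here rather than determined by the GTRS.

```python
from typing import Dict, List, Tuple

# A rule is modeled here as a word pair; this representation is an assumption.
Rule = Tuple[str, str]


def three_way_partition(
    scores: Dict[Rule, float], alpha: float, beta: float
) -> Dict[str, List[Rule]]:
    """Split rules into three regions using a pair of thresholds.

    Hypothetical convention: score >= alpha is a strong positive association,
    score <= beta is a strong negative association, anything between is weak.
    """
    if beta >= alpha:
        raise ValueError("expected beta < alpha")
    regions: Dict[str, List[Rule]] = {"positive": [], "negative": [], "weak": []}
    for rule, score in scores.items():
        if score >= alpha:
            regions["positive"].append(rule)
        elif score <= beta:
            regions["negative"].append(rule)
        else:
            regions["weak"].append(rule)
    return regions


# Toy association scores, invented purely for illustration.
toy_scores = {
    ("battery", "charger"): 0.8,
    ("screen", "display"): 0.6,
    ("price", "lens"): 0.1,
    ("apple", "orchard"): -0.7,
}

# Hand-picked thresholds; the paper determines them automatically per domain.
result = three_way_partition(toy_scores, alpha=0.5, beta=-0.5)
```

Lowering alpha or raising beta moves more rules out of the weak region, so the quantity of selected rules grows while their average strength drops. This is the quantity-versus-quality tradeoff that the abstract says the GTRS is used to settle.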
