Article

Multi-label classification of computer science documents using fuzzy logic

Publisher

NATL SCIENCE FOUNDATION SRI LANKA
DOI: 10.4038/jnsfsr.v44i2.7996

Keywords

Category identification; document classification; fuzzy rules; research paper classification

Classification has already been used to predict predefined topics in many domains, including research paper classification. A research paper may belong to one or more topics (classes). State-of-the-art techniques in this area have two limitations: (1) most classify a document into at most one principal topic and do not identify all of a paper's topic associations; (2) they treat the classification of research documents as a problem in the discrete domain, and their accuracy remains low when multiple classes are considered for a single document. These limitations led us to explore the fuzzy domain for classifying Computer Science documents, because it is not known in advance whether a document belongs to one category or to several. Furthermore, fuzzy classification helps to identify the degree to which a paper belongs to each topic. Validating our findings requires a comprehensive dataset, and such a dataset has been made available by the scientific community for the Computer Science domain; we therefore restrict our focus to that domain in this paper. Key features are extracted from the Title and Keywords of each research paper, and term frequency (TF) is used as the weighting scheme. Because a paper may belong to more than one category, we use a fuzzy classifier, which automatically identifies all possible categories. One or more final topics are then assigned based on a threshold. We propose a generic framework and two algorithms for identifying the category or categories. After the fuzzy classifier has performed the classification, a rules updater evolves (updates) our rules. The accuracy of the technique has been compared with that of different classification techniques, and the proposed approach outperformed the state-of-the-art approaches.
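The pipeline outlined in the abstract (TF weights from the Title and Keywords, a membership degree per category, then a threshold) can be illustrated with a minimal sketch. This is not the authors' framework or either of their two algorithms, which the abstract does not specify. The category vocabularies in `CATEGORY_TERMS`, the normalisation by the top score, the function names, and the 0.5 threshold are all assumptions made for illustration, and the rules updater is not modelled.

```python
import re
from collections import Counter

# Hypothetical category vocabularies; the paper's actual fuzzy rules
# are not given in the abstract.
CATEGORY_TERMS = {
    "Information Retrieval": {"retrieval", "search", "ranking", "query"},
    "Machine Learning": {"learning", "classification", "neural", "fuzzy"},
    "Databases": {"database", "query", "transaction", "sql"},
}


def term_frequencies(text):
    """Count lowercase alphabetic tokens in the text (raw TF)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def memberships(title, keywords, category_terms=CATEGORY_TERMS):
    """Return a membership degree in [0, 1] for every category.

    Each category's raw score is the summed TF of its vocabulary terms
    found in the title and keywords. Dividing by the highest score is an
    assumed normalisation, not one taken from the paper.
    """
    tf = term_frequencies(title + " " + " ".join(keywords))
    scores = {
        cat: sum(tf[term] for term in terms)
        for cat, terms in category_terms.items()
    }
    top = max(scores.values(), default=0)
    if top == 0:
        return {cat: 0.0 for cat in scores}
    return {cat: score / top for cat, score in scores.items()}


def assign_topics(degrees, threshold=0.5):
    """Keep every category whose membership degree meets the threshold."""
    return sorted(cat for cat, degree in degrees.items() if degree >= threshold)


if __name__ == "__main__":
    degrees = memberships(
        "Fuzzy classification for query ranking in search",
        ["document classification", "retrieval"],
    )
    print(degrees)
    print(assign_topics(degrees, threshold=0.5))
```

With this toy vocabulary, a paper can receive more than one topic whenever several categories score close to the strongest one, which is the multi-label behaviour the abstract describes. Raising the threshold narrows the assignment toward a single principal topic.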
