4.5 Article

A new network model for extracting text keywords

Journal

SCIENTOMETRICS
Volume 116, Issue 1, Pages 339-361

Publisher

SPRINGER
DOI: 10.1007/s11192-018-2743-5

Keywords

Keyword extraction; Complex network; Synthetic eigenvalue; Text keyword; Network theory

Funding

  1. National Natural Science Foundation of China [U1434209]
  2. National Key Research and Development Program of China [2017YFB1201105]
  3. Research Foundation of State Key Laboratory of Railway Traffic Control and Safety [RCS2018ZZ003]

Text keywords are the meaningful and important words in a document; they give a precise overview of its content and reflect the author's intent. Keyword extraction has received considerable attention, and network-based methods are one prominent approach. However, existing network-based keyword extraction methods consider only the connections between words in a document and ignore the influence of sentences. Because a sentence consists of many words, and the words within a sentence affect one another, neglecting sentences loses information. In this paper, we introduce a word network whose nodes represent the words in a document, and we call any keyword extraction method based on a word network a Word-net method. We then propose a new network model that takes the influence of sentences into account, together with a new word-sentence method based on that model. Experimental results show that our method outperforms the Word-net method, the classical term frequency-inverse document frequency (TF-IDF) method, the most frequent method, and the TextRank method. The precision, recall, and F-measure of our method are 7.95%, 8.27%, and 6.54% higher, respectively, than those of the Word-net method, and its average precision is 17.56% higher than that of TF-IDF. A two-way analysis of variance validates the empirical analysis and indicates that both the keyword extraction method and the number of keywords have statistically significant effects on the evaluation metric values.
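
To make the Word-net baseline and the evaluation metrics concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how edges are formed or how nodes are scored, so the sketch assumes that two words are linked when they share a sentence and that nodes are ranked by weighted degree. The function names, the stopword list, and the sample text are all hypothetical. The proposed word-sentence model is not reproduced here.

```python
from collections import defaultdict
from itertools import combinations
import re


def build_word_network(text, stopwords=frozenset()):
    """Return {word: {neighbor: weight}}.

    Assumption: an edge links two words that appear in the same sentence,
    and its weight counts how many sentences they share.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for sentence in re.split(r"[.!?]+", text.lower()):
        # Unique, non-stopword tokens of this sentence.
        words = sorted({w for w in re.findall(r"[a-z]+", sentence)
                        if w not in stopwords})
        for a, b in combinations(words, 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def top_keywords(graph, k):
    """Rank nodes by weighted degree (an assumed score); ties go alphabetically."""
    strength = {w: sum(nbrs.values()) for w, nbrs in graph.items()}
    return sorted(strength, key=lambda w: (-strength[w], w))[:k]


def precision_recall_f(extracted, reference):
    """Standard precision, recall, and F-measure over two keyword sets."""
    extracted, reference = set(extracted), set(reference)
    hits = len(extracted & reference)
    precision = hits / len(extracted) if extracted else 0.0
    recall = hits / len(reference) if reference else 0.0
    f = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f


# Hypothetical toy input, for illustration only.
STOPWORDS = frozenset({"a", "the", "of", "in", "and", "is", "are"})
TEXT = (
    "Keyword extraction ranks words in a network. "
    "The network links words in a sentence. "
    "Words in the network are nodes."
)

graph = build_word_network(TEXT, STOPWORDS)
keywords = top_keywords(graph, 2)
scores = precision_recall_f(keywords, {"network", "words", "keyword"})
print(keywords, scores)
```

Varying `k` in `top_keywords` corresponds to the "keyword numbers" factor mentioned in the abstract: precision and recall generally trade off as more candidates are returned.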
