Article

Computing semantic similarity based on novel models of semantic representation using Wikipedia

INFORMATION PROCESSING & MANAGEMENT (2018)

Journal

INFORMATION PROCESSING & MANAGEMENT

Volume 54, Issue 6, Pages 1002-1021

Publisher

ELSEVIER SCI LTD

DOI: 10.1016/j.ipm.2018.07.002

Keywords

Semantic similarity; Concept similarity; Information content; Feature-based methods; Wikipedia

Categories

Computer Science, Information Systems; Information Science & Library Science

Funding

National Natural Science Foundation of China [61772210, 61272066]
Project of Science and Technology in Guangzhou in China [201807010043]
Key project in universities in Guangdong Province of China [2016KZDXM024]
Innovation project of postgraduate education in Guangdong Province of China [2016SFKC_13]

Abstract

Computing Semantic Similarity (SS) between concepts is one of the most critical issues in many domains such as Natural Language Processing and Artificial Intelligence. Over the years, several SS measurement methods have been proposed that exploit different knowledge resources. Wikipedia provides a large domain-independent encyclopedic repository and a semantic network for computing SS between concepts. Traditional feature-based measures rely on linear combinations of different properties and have two main limitations: insufficient information and loss of semantic information. In this paper, we propose several hybrid SS measurement approaches that use the Information Content (IC) and features of concepts and avoid these limitations. To integrate discrete properties into one component, we present two models of semantic representation, called CORM and CARM. We then compute SS based on these models and use the IC of categories as a supplement to the SS measurement. The evaluation, based on several widely used benchmarks and a benchmark we developed ourselves, supports these intuitions with respect to human judgments. In summary, our approaches are more efficient in determining SS between concepts and correlate better with human judgments than previous methods such as Word2Vec and NASARI.
