Article

Word Sense Disambiguation Using Prior Probability Estimation Based on the Korean WordNet

ELECTRONICS (2021)

Journal

ELECTRONICS

Volume 10, Issue 23, Pages -

Publisher

MDPI

DOI: 10.3390/electronics10232938

Keywords

word sense disambiguation; Korean WordNet; knowledge-based model; data mining; information extraction

Categories

Computer Science, Information Systems; Engineering, Electrical & Electronic; Physics, Applied

Funding

Institute of Information & Communications Technology Planning & Evaluation (IITP) - Korea government (MSIT) [2020-0-01450]

Summary

This study proposes an unsupervised word sense disambiguation method based on the Korean WordNet. The method resolves the data deficiency problem by calculating the chi-squared (χ²) statistic between related words, and it outperforms supervised disambiguation methods.

Abstract

Supervised disambiguation trained on a large amount of corpus data delivers better performance than other word sense disambiguation methods. However, large-scale sense-tagged corpora are costly and time-consuming to construct. Unsupervised disambiguation, by contrast, is relatively easy to implement, although most efforts so far have not produced satisfactory results. A primary reason for its weaker performance is that the semantic occurrence probability of ambiguous words is not available, so a data deficiency problem arises when determining the dependency between words. This paper proposes an unsupervised disambiguation method that uses prior probability estimation based on the Korean WordNet and performs better than supervised disambiguation. In the Korean WordNet, every word has semantic characteristics similar to those of its related words. The method therefore assumes that the dependency between two words is the same as the dependency between their related words, and it resolves the data deficiency problem by calculating the chi-squared (χ²) statistic between those related words. Moreover, to obtain the same effect as the semantic occurrence probability that supervised disambiguation uses as its prior probability, semantically related words of each ambiguous word are collected and used as prior probability data. Experiments were conducted on Korean, English, and Chinese to evaluate the proposed lexical disambiguation method. The proposed method outperformed supervised disambiguation methods even though it is unsupervised (it uses a knowledge-based approach).
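
To make the chi-squared step more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the standard Pearson chi-squared statistic on a 2x2 co-occurrence table and pools counts over each group of related words. The function names, the toy corpus, the related-word sets, and the positive-association guard are all hypothetical choices made for illustration; the abstract does not specify them, and the prior probability estimation is not covered here.

```python
from typing import Dict, List, Set, Tuple


def chi_squared_2x2(n11: int, n12: int, n21: int, n22: int) -> float:
    """Pearson chi-squared statistic for a 2x2 co-occurrence table.

    n11: contexts containing both groups, n12: only group A,
    n21: only group B, n22: neither.
    """
    n = n11 + n12 + n21 + n22
    denom = (n11 + n12) * (n21 + n22) * (n11 + n21) * (n12 + n22)
    if denom == 0:  # a row or column is empty, so no evidence of dependency
        return 0.0
    return n * (n11 * n22 - n12 * n21) ** 2 / denom


def pooled_table(
    sentences: List[Set[str]], group_a: Set[str], group_b: Set[str]
) -> Tuple[int, int, int, int]:
    """Count sentences by whether they contain any word of each group.

    Pooling over related words is an assumption of this sketch: a sentence
    counts for a group if it contains at least one of the group's words.
    """
    n11 = n12 = n21 = n22 = 0
    for tokens in sentences:
        has_a = bool(tokens & group_a)
        has_b = bool(tokens & group_b)
        if has_a and has_b:
            n11 += 1
        elif has_a:
            n12 += 1
        elif has_b:
            n21 += 1
        else:
            n22 += 1
    return n11, n12, n21, n22


def pick_sense(
    sentences: List[Set[str]],
    sense_related: Dict[str, Set[str]],
    context_related: Set[str],
) -> Tuple[str, Dict[str, float]]:
    """Score each sense by the chi-squared dependency between its related
    words and the related words of a context word; return the best sense."""
    scores: Dict[str, float] = {}
    for sense, related in sense_related.items():
        n11, n12, n21, n22 = pooled_table(sentences, related, context_related)
        # Chi-squared is also large for negative association, so this sketch
        # keeps the score only when the groups co-occur more than expected.
        positive = n11 * n22 > n12 * n21
        scores[sense] = chi_squared_2x2(n11, n12, n21, n22) if positive else 0.0
    return max(scores, key=scores.get), scores


# Hypothetical toy data: each sentence is a set of tokens.
CORPUS = [
    {"money", "savings"}, {"loan", "deposit"}, {"money", "deposit"},
    {"shore", "water"}, {"water", "fish"}, {"shore", "sand"},
]
SENSES = {"finance": {"money", "loan"}, "river": {"shore", "water"}}
CONTEXT = {"deposit", "savings"}

best, scores = pick_sense(CORPUS, SENSES, CONTEXT)
print(best, scores)
```

The guard on positive association matters in this toy corpus: without it, the "river" sense would receive the same chi-squared value as "finance", because the statistic measures the strength of dependency rather than its direction.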
