Article

Contextual Word2Vec Model for Understanding Chinese Out of Vocabularies on Online Social Media

Publisher

IGI GLOBAL
DOI: 10.4018/IJSWIS.309428

Keywords

Out of Vocabulary (OOV); Social Media; Word Embedding; Word2Vec

Funding

  1. National Research Foundation of Korea (NRF) - Korean government (MSIP) [NRF-2020R1A2B5B01002207, NRF-2021R1I1A1A01060302]

This chapter proposes a contextual Word2Vec model for understanding out-of-vocabulary (OOV) words. The authors extract OOV words using left-right entropy and point information entropy, construct a word vector space with Word2Vec, and obtain contextual information with CBOW. The results show that the proposed model achieves higher accuracy than Skip-Gram.
In this chapter, the authors propose a contextual Word2Vec model for understanding out-of-vocabulary (OOV) words. OOV words are extracted using left-right entropy and point information entropy. Word2Vec is used to construct the word vector space, and CBOW (continuous bag of words) is used to obtain the contextual information of the words. If a word has contextual information similar to that of an OOV word, that word can be used to understand the OOV word. The experiments use a Weibo corpus as the dataset. The results show that the proposed model achieves 97.10% accuracy, which is 8.53% higher than Skip-Gram.
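
The left-right entropy criterion mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy corpus, and the choice of natural logarithm are assumptions made for illustration only. The idea is that a candidate string surrounded by many different characters on both sides (high entropy on the left and on the right) is more likely to be a free-standing word, while a fragment that is always followed by the same character has zero entropy on that side.

```python
import math
from collections import Counter


def neighbor_entropy(neighbors):
    """Shannon entropy (natural log) of a list of neighboring characters."""
    counts = Counter(neighbors)
    total = sum(counts.values())
    # An empty neighbor list yields 0 because the sum has no terms.
    return sum(-(c / total) * math.log(c / total) for c in counts.values())


def left_right_entropy(corpus, candidate):
    """Return (left_entropy, right_entropy) of `candidate` over a list of strings."""
    left, right = [], []
    for text in corpus:
        start = text.find(candidate)
        while start != -1:
            end = start + len(candidate)
            if start > 0:
                left.append(text[start - 1])   # character just before the match
            if end < len(text):
                right.append(text[end])        # character just after the match
            start = text.find(candidate, start + 1)
    return neighbor_entropy(left), neighbor_entropy(right)


# Hypothetical toy corpus; a real experiment would use Weibo posts.
corpus = ["这个真给力啊", "太给力了", "不给力呀", "很给力的"]

word_scores = left_right_entropy(corpus, "给力")   # varied neighbors on both sides
fragment_scores = left_right_entropy(corpus, "给")  # always followed by "力"
print(word_scores, fragment_scores)
```

In this toy corpus the candidate "给力" has four distinct neighbors on each side, whereas the fragment "给" is always followed by the same character, so its right entropy is zero. A threshold on the smaller of the two values is one common way to use such scores, though the threshold used in the chapter is not stated here.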
