Article

MSC+: Language pattern learning for word sense induction and disambiguation

Journal

KNOWLEDGE-BASED SYSTEMS
Volume 188

Publisher

ELSEVIER
DOI: 10.1016/j.knosys.2019.105017

Keywords

Lexical semantics; Information extraction; Linguistic pattern mining; Word sense induction; Word sense disambiguation

Funding

  1. CAPES (Brazilian Coordination of Superior Level Staff Improvement) from the Ministry of Education of Brazil
  2. CAPES
  3. Spanish Government under the Maria de Maeztu Units of Excellence Programme [MDM-2015-0502]

Identifying the correct meaning of words in context or discovering new word senses is particularly useful for several tasks such as question answering, information extraction, information retrieval, and text summarization. However, especially in user-generated content and online communication (e.g., Twitter), speakers continuously craft new meanings by using existing words in novel contexts. Consequently, lexical semantic inventories and systems have difficulty coping with semantic drift. In this work, we propose an approach to induce and disambiguate the senses of selected target words in collections of short texts, such as tweets, through the use of fuzzy lexico-semantic patterns that we define as sequences of Morpho-semantic Components (MSC). We learn these patterns, which we call MSC+ patterns, automatically from text data. Experimental results show that instances of some MSC+ patterns arise in a number of tweets, although different tweets sometimes use different words to convey the sense of the respective MSC. Exploiting MSC+ patterns that induce semantics on target words enables effective word sense disambiguation mechanisms, leading to improvements over the state of the art. (C) 2019 Elsevier B.V. All rights reserved.
