Journal
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
Volume 33, Issue 4, Pages 1737-1749
Publisher
IEEE COMPUTER SOC
DOI: 10.1109/TKDE.2019.2945764
Keywords
Microblog retrieval; pseudo-relevance feedback; query expansion; word embeddings
Funding
- National Natural Science Foundation of China [61751201]
This paper aims to improve microblog retrieval effectiveness by using local conceptual word embeddings for query expansion and by incorporating temporal evidence. In experiments on the official TREC Twitter corpora, the proposed approach outperforms baseline methods, both in capturing the information need and in meeting users' real-time information requirements.
Because microblog texts such as tweets are strictly limited to 140 characters, traditional Information Retrieval techniques suffer severely from the vocabulary mismatch problem and perform poorly in the microblogosphere. To address this challenge, this paper focuses on using local conceptual word embeddings to enhance microblog retrieval effectiveness. In particular, we propose a novel k-Nearest Neighbor (kNN) based Query Expansion (QE) algorithm that generates words from local word embeddings to expand the original query, leading to a better understanding of the information need. In addition, to further satisfy users' real-time information needs, we incorporate temporal evidence into the expansion algorithm, which boosts recent tweets in the retrieval results for a given topic. Experimental results on the official TREC Twitter corpora demonstrate the significant superiority of our approach over baseline methods.
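The abstract does not spell out the expansion algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that query terms are averaged into a centroid, that candidate words are ranked by cosine similarity to that centroid, and that temporal evidence enters as an exponential recency weight over the ages of the tweets containing each candidate. The function names (`expand_query`, `recency_weight`), the `rate` parameter, and the toy vectors and tweet ages are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recency_weight(ages_in_days, rate=0.1):
    """Mean exponential decay over tweet ages (assumed form of temporal evidence)."""
    return sum(math.exp(-rate * age) for age in ages_in_days) / len(ages_in_days)


def expand_query(query_terms, embeddings, k=2, term_ages=None, rate=0.1):
    """Return the k candidate words nearest to the query centroid.

    embeddings: dict mapping word -> vector (stand-in for local word embeddings).
    term_ages:  optional dict mapping word -> ages (days) of tweets containing it;
                words without an entry keep their similarity score unchanged.
    """
    vectors = [embeddings[t] for t in query_terms if t in embeddings]
    if not vectors:
        return []
    dim = len(vectors[0])
    centroid = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

    scored = []
    for word, vec in embeddings.items():
        if word in query_terms:
            continue  # never propose a term the query already contains
        score = cosine(centroid, vec)
        if term_ages and word in term_ages:
            score *= recency_weight(term_ages[word], rate)
        scored.append((word, score))

    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return [word for word, _ in scored[:k]]


# Toy, hand-made vectors standing in for embeddings trained on feedback tweets.
toy_embeddings = {
    "earthquake": [0.9, 0.1, 0.0],
    "quake": [0.85, 0.15, 0.05],
    "tsunami": [0.7, 0.3, 0.1],
    "concert": [0.0, 0.2, 0.9],
}
# Hypothetical ages (in days) of the tweets in which each candidate appears.
toy_ages = {"quake": [10, 12], "tsunami": [0, 1]}

plain = expand_query(["earthquake"], toy_embeddings, k=2)
temporal = expand_query(["earthquake"], toy_embeddings, k=2, term_ages=toy_ages)
```

On this toy data, `plain` ranks "quake" ahead of "tsunami" on similarity alone, while `temporal` reverses the order because "tsunami" appears only in very recent tweets. The decay form and its rate are illustrative choices; the paper may combine temporal evidence differently.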