PMIVec: a word embedding model guided by point-wise mutual information criterion

MULTIMEDIA SYSTEMS (2022)

Journal

MULTIMEDIA SYSTEMS

Volume 28, Issue 6, Pages 2275-2283

Publisher

SPRINGER

DOI: 10.1007/s00530-022-00928-4

Keywords

Natural language processing; Word embedding; Point-wise mutual information

Funding

NSFC [61836011, U20B2070, 61976199]

Automated Summary

Word embedding represents words as dense vectors that capture semantic similarity. This paper introduces a novel method, PMIVec, which learns context vectors as the word representations and uses point-wise mutual information to measure semantic similarity. Experimental results show that PMIVec consistently outperforms state-of-the-art models.

Abstract

Word embedding aims to represent each word with a dense vector that reveals the semantic similarity between words. Existing methods such as word2vec derive these representations by factorizing the word-context matrix into two parts: word vectors and context vectors. However, only one part is used to represent the word, which may damage the semantic similarity between words. To address this problem, this paper proposes PMIVec, a novel word embedding method based on the point-wise mutual information criterion. Our method explicitly learns the context vector as the final representation of each word and discards the word vector. To avoid damaging the semantic similarity between words, we normalize the word vector during training. Moreover, this paper uses point-wise mutual information to measure the semantic similarity between words, which is more consistent with human intuition about semantic similarity. Experiments on public data sets show that PMIVec consistently outperforms state-of-the-art models.
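
For readers unfamiliar with the criterion named in the abstract, point-wise mutual information between a word w and a context c is conventionally defined as PMI(w, c) = log( p(w, c) / (p(w) p(c)) ). The sketch below estimates this quantity from a toy list of (word, context) pairs. It only illustrates the standard definition and is not the PMIVec training procedure, which the abstract does not spell out; the function name `pmi_table` and the toy data are assumptions made for this example.

```python
import math
from collections import Counter


def pmi_table(pairs):
    """Estimate PMI(w, c) = log(p(w, c) / (p(w) * p(c))) from observed pairs.

    `pairs` is a list of (word, context) co-occurrences (hypothetical input
    format). Probabilities are plain relative frequencies, with no smoothing.
    """
    pair_counts = Counter(pairs)
    word_counts = Counter(w for w, _ in pairs)
    ctx_counts = Counter(c for _, c in pairs)
    total = len(pairs)
    # p(w,c) / (p(w) p(c)) simplifies to n(w,c) * total / (n(w) * n(c)).
    return {
        (w, c): math.log(n * total / (word_counts[w] * ctx_counts[c]))
        for (w, c), n in pair_counts.items()
    }


# Toy corpus: "cat" co-occurs with "purrs" more often than chance predicts.
pairs = [
    ("cat", "purrs"), ("cat", "purrs"), ("cat", "runs"),
    ("dog", "barks"), ("dog", "barks"), ("dog", "runs"),
]
table = pmi_table(pairs)
```

In this toy data, `table[("cat", "purrs")]` is positive because the pair occurs twice as often as independence would predict, while `table[("cat", "runs")]` is zero because the pair occurs exactly as often as chance. Pairs that never co-occur are absent from the table, since their PMI would be negative infinity.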
