☆ 4.1 Article

Robust semantic text similarity using LSA, machine learning, and linguistic resources

LANGUAGE RESOURCES AND EVALUATION (2016)

Journal

LANGUAGE RESOURCES AND EVALUATION

Volume 50, Issue 1, Pages 125-161

Publisher

SPRINGER

DOI: 10.1007/s10579-015-9319-2

Keywords

Latent semantic analysis; WordNet; Term alignment; Semantic similarity

Category

Computer Science, Interdisciplinary Applications

Funding

US National Science Foundation [1228198, 1250627, 0910838]
Directorate for Computer & Information Science & Engineering
Division of Computer and Network Systems [1228673] Funding Source: National Science Foundation
Directorate for Computer & Information Science & Engineering
Division of Information & Intelligent Systems [1250627, 0910838] Funding Source: National Science Foundation
Division of Computer and Network Systems
Directorate for Computer & Information Science & Engineering [1228198] Funding Source: National Science Foundation

Abstract

Semantic textual similarity is a measure of the degree of semantic equivalence between two pieces of text. We describe the SemSim system and its performance in the *SEM 2013 and SemEval-2014 tasks on semantic textual similarity. At the core of our system lies a robust distributional word similarity component that combines latent semantic analysis and machine learning augmented with data from several linguistic resources. We used a simple term alignment algorithm to handle longer pieces of text. Additional wrappers and resources were used to handle task-specific challenges that include processing Spanish text, comparing text sequences of different lengths, handling informal words and phrases, and matching words with sense definitions. In the *SEM 2013 task on Semantic Textual Similarity, our best performing system ranked first among the 89 submitted runs. In the SemEval-2014 task on Multilingual Semantic Textual Similarity, we ranked a close second in both the English and Spanish subtasks. In the SemEval-2014 task on Cross-Level Semantic Similarity, we ranked first in Sentence-Phrase, Phrase-Word, and Word-Sense subtasks and second in the Paragraph-Sentence subtask.
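
The abstract does not spell out the alignment procedure, so the following is only a minimal sketch of how a term alignment score over a word similarity function could be computed, not the SemSim implementation. It assumes a greedy best-match alignment averaged over both directions. The names `toy_word_similarity`, `directed_alignment_score`, and `aligned_similarity` are hypothetical, and the character-bigram overlap stands in for the LSA-based word similarity model, which is not reproduced here.

```python
from typing import Callable, Sequence


def toy_word_similarity(a: str, b: str) -> float:
    """Hypothetical stand-in for a learned word similarity model.

    Uses the Dice coefficient over character bigrams; a real system
    would plug in a distributional (e.g. LSA-based) similarity here.
    """
    if a == b:
        return 1.0
    bigrams_a = {a[i:i + 2] for i in range(len(a) - 1)}
    bigrams_b = {b[i:i + 2] for i in range(len(b) - 1)}
    if not bigrams_a or not bigrams_b:
        return 0.0
    return 2 * len(bigrams_a & bigrams_b) / (len(bigrams_a) + len(bigrams_b))


def directed_alignment_score(
    source: Sequence[str],
    target: Sequence[str],
    sim: Callable[[str, str], float],
) -> float:
    """Align each source term to its best-matching target term and average."""
    if not source or not target:
        return 0.0
    return sum(max(sim(s, t) for t in target) for s in source) / len(source)


def aligned_similarity(
    text_a: str,
    text_b: str,
    sim: Callable[[str, str], float] = toy_word_similarity,
) -> float:
    """Symmetric text similarity: mean of the two directed alignment scores."""
    tokens_a = text_a.lower().split()
    tokens_b = text_b.lower().split()
    forward = directed_alignment_score(tokens_a, tokens_b, sim)
    backward = directed_alignment_score(tokens_b, tokens_a, sim)
    return 0.5 * (forward + backward)


if __name__ == "__main__":
    print(aligned_similarity("the dog barks", "the dogs barked"))
    print(aligned_similarity("the dog barks", "stock prices fell"))
```

Averaging both directions keeps the score symmetric and penalizes a short text that matches only a fragment of a longer one, which is one plausible way to cope with text sequences of different lengths.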
