Article

Sentence transition matrix: An efficient approach that preserves sentence semantics

Journal

COMPUTER SPEECH AND LANGUAGE
Volume 71, Issue -, Pages -

Publisher

ACADEMIC PRESS LTD- ELSEVIER SCIENCE LTD
DOI: 10.1016/j.csl.2021.101266

Keywords

Sentence embedding; Sentence semantics; Transition matrix; Paraphrase; Natural language processing

Funding

  1. National Research Foundation of Korea (NRF) - Korea government (MSIT) [NRF-2019R1F1A1060338]
  2. Korea Institute for Advancement of Technology (KIAT) - Korea Government (MOTIE) [P0008691]

Abstract

Sentence embedding is an influential research topic in natural language processing (NLP). Generating sentence vectors that reflect the intrinsic meaning of sentences is crucial for improving performance in various NLP tasks. Therefore, numerous supervised and unsupervised sentence-representation approaches have been proposed since the advent of the distributed representation of words. These approaches have been evaluated on semantic textual similarity (STS) tasks designed to measure the degree of semantic information preservation; neural network-based supervised embedding models typically deliver state-of-the-art performance. However, these models have limitations in that they have numerous learnable parameters and thus require large amounts of specific types of labeled training data. Pretrained language model-based approaches, which have become a predominant trend in the NLP field, alleviate this issue to some extent; however, it is still necessary to collect sufficient labeled data for the fine-tuning process. Herein, we propose an efficient approach that learns a transition matrix tuning a sentence embedding vector to capture the latent semantic meaning. Our proposed method has two practical advantages: (1) it can be applied to any sentence embedding method, and (2) it can deliver robust performance in STS tasks with only a few training examples.
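The record does not describe the authors' training procedure, so the following is only a hedged illustration of the general idea the abstract names: learning a matrix that maps a sentence embedding toward a semantically equivalent target embedding. The function name, the ridge-regression formulation, and the toy paraphrase data below are all hypothetical, not the paper's method.

```python
import numpy as np

def learn_transition_matrix(src, tgt, reg=1e-3):
    """Hypothetical sketch: fit a transition matrix W by ridge-regularized
    least squares so that src @ W approximates tgt.

    src, tgt : (n_pairs, dim) arrays of sentence embeddings, where row i of
    tgt is a paraphrase (semantic target) of row i of src.
    """
    d = src.shape[1]
    # Normal equations with a small ridge term for numerical stability.
    A = src.T @ src + reg * np.eye(d)
    B = src.T @ tgt
    return np.linalg.solve(A, B)

# Toy data standing in for embeddings of paraphrase pairs.
rng = np.random.default_rng(0)
src = rng.normal(size=(20, 8))          # 20 "source" sentence vectors, dim 8
W_true = rng.normal(size=(8, 8))        # synthetic ground-truth mapping
tgt = src @ W_true                      # their "paraphrase" embeddings

W = learn_transition_matrix(src, tgt, reg=1e-6)
mapped = src @ W                        # tuned embeddings, close to tgt
```

Because the fit is a single linear solve over a handful of pairs, a sketch like this hints at why such an approach can remain usable with only a few training examples, consistent with the abstract's second claimed advantage.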
