Article

A hybrid approach of Weighted Fine-Tuned BERT extraction with deep Siamese Bi-LSTM model for semantic text similarity identification

MULTIMEDIA TOOLS AND APPLICATIONS (2022)

Journal

MULTIMEDIA TOOLS AND APPLICATIONS

Volume 81, Issue 5, Pages 6131-6157

Publisher

SPRINGER

DOI: 10.1007/s11042-021-11771-6

Keywords

BERT; Bi-LSTM; CNN; NLP; Semantic text-similarity; Embedded vectors; Siamese networks

Categories

Computer Science, Information Systems; Computer Science, Software Engineering; Computer Science, Theory & Methods; Engineering, Electrical & Electronic

Smart summary

Traditional text-similarity methods require large amounts of labeled data and human intervention, and they neglect contextual and word-order information. This study explores the use of NLP application tasks and deep-learning methods for text-similarity detection, and proposes a new hybrid approach that achieves higher accuracy in semantic text similarity.

Abstract

Conventional semantic text-similarity methods require large amounts of labeled training data as well as human intervention. They generally neglect contextual and word-order information, which results in the data-sparseness problem and the latitudinal-explosion issue. Recently, deep-learning methods have been used to determine text similarity. This study therefore investigates the use of NLP application tasks for detecting the text similarity of question pairs or documents, and explores similarity score prediction. A new hybrid approach is implemented that combines Weighted Fine-Tuned BERT feature extraction with a Siamese Bi-LSTM model. The technique is applied to question pairs from the Quora dataset to determine their semantic text similarity. Text features are extracted with BERT, followed by weighted word embedding. The features, together with their weight values, are represented as embedded vectors and passed through the layers of the Siamese network. The embedded vectors of the input text features are trained with the deep Siamese Bi-LSTM model across its layers. Finally, similarity scores are determined for each sentence, and the semantic text similarity is learned. The proposed framework is evaluated in terms of accuracy, precision, recall, and F1 score, and compared with existing text-similarity detection methods. It reached 91% accuracy in determining semantic text similarity, higher than the existing algorithms it was compared with.
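The scoring and evaluation steps named in the abstract can be illustrated with a small, dependency-free sketch. This is not the authors' model: the fine-tuned BERT extractor and the Siamese Bi-LSTM encoder are replaced here by a plain weighted average of token vectors, cosine similarity stands in for the learned similarity score, and all function names are hypothetical. Only the four metrics (accuracy, precision, recall, F1) follow their standard definitions.

```python
import math


def weighted_pool(token_vectors, weights):
    """Collapse per-token vectors into one sentence vector.

    Assumption: a normalized weighted average stands in for the
    paper's weighted embedding plus Bi-LSTM encoding.
    """
    total = sum(weights)
    dim = len(token_vectors[0])
    return [
        sum(w * vec[i] for vec, w in zip(token_vectors, weights)) / total
        for i in range(dim)
    ]


def cosine_similarity(a, b):
    """Similarity score for two sentence vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for 0/1 duplicate-pair labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

In a setup like this, a question pair would be labeled a duplicate when its similarity score exceeds a chosen threshold, and the resulting 0/1 predictions would be passed to `binary_metrics` together with the gold labels.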
