4.8 Article

Arabic duplicate questions detection based on contextual representation, class label matching, and structured self attention

Publisher

ELSEVIER
DOI: 10.1016/j.jksuci.2020.11.032

Keywords

Duplicate questions detection; Arabic question answering; Contextual embedding; ELMo; Neural attention mechanism; Text classification

This paper presents a duplicate question detection method based on contextual word representation, question classification, and self-attention, achieving good performance in experiments.
Question Answering Systems (QAS) are emerging solutions that provide exact and precise answers to natural-language questions. Duplicate Question Detection (DQD), which aims to reuse previous answers, has been shown to improve user experience and significantly reduce response time. However, few Arabic QAS integrate duplicate question detection into their workflow. In this paper, we build a DQD method based on contextual word representation, question classification, and forward/backward structured self-attention. First, we extract contextual word representations with Embeddings from Language Models (ELMo) to map questions into a vector space. Next, we train two models to classify question embeddings according to two taxonomies: Hamza et al. and Li & Roth. Then, we introduce a class label matching step to filter out questions whose class labels differ. Finally, we propose a Bidirectional Attention Bidirectional LSTM (BiAttention BiLSTM) model that focuses only on keywords to predict whether a question pair is a duplicate. We also apply a data augmentation strategy based on symmetry, reflexivity, and transitivity relations to improve the generalization of our model. Various experiments are performed to evaluate the impact of question classification and the pre-processing step on the DQD model. The results show that our model performs well compared with the baseline results. (C) 2020 The Authors. Published by Elsevier B.V. on behalf of King Saud University.
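
The abstract does not spell out how the symmetry, reflexivity, and transitivity relations are applied during data augmentation. As a rough illustration only, the sketch below shows one plausible way to expand a set of labeled duplicate pairs with those three relations. The function name `augment_pairs`, the union-find grouping, and the string question identifiers are all assumptions made for this example and are not taken from the paper.

```python
from itertools import combinations


def augment_pairs(duplicate_pairs):
    """Expand duplicate pairs via reflexivity, symmetry, and transitivity.

    Hypothetical helper: groups questions into equivalence classes with a
    small union-find, then emits every ordered pair inside each class.
    """
    parent = {}

    def find(q):
        # Return the representative of q's group, registering q if unseen.
        parent.setdefault(q, q)
        while parent[q] != q:
            parent[q] = parent[parent[q]]  # path halving
            q = parent[q]
        return q

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Transitivity: if (a, b) and (b, c) are duplicates, a, b, c share a group.
    for a, b in duplicate_pairs:
        union(a, b)

    groups = {}
    for q in list(parent):
        groups.setdefault(find(q), []).append(q)

    augmented = set()
    for members in groups.values():
        for q in members:
            augmented.add((q, q))  # reflexivity: a question duplicates itself
        for a, b in combinations(members, 2):
            augmented.add((a, b))
            augmented.add((b, a))  # symmetry: pair order does not matter
    return augmented
```

For example, calling `augment_pairs([("q1", "q2"), ("q2", "q3")])` would add the inferred pair `("q1", "q3")`, its mirror, and the three self-pairs. This sketch covers positive (duplicate) pairs only; how non-duplicate pairs are handled is not stated in the abstract.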
