Article

WabiQA: A Wikipedia-Based Thai Question-Answering System

INFORMATION PROCESSING & MANAGEMENT (2021)

Journal

INFORMATION PROCESSING & MANAGEMENT

Volume 58, Issue 1, Pages -

Publisher

ELSEVIER SCI LTD

DOI: 10.1016/j.ipm.2020.102431

Keywords

Question Answering System; Deep Learning; Creative Language Processing

Funding

Thailand Science Research and Innovation (TSRI) [RSA6280105]

Automated Summary

This paper presents a novel system, WabiQA, for automatically answering questions in the Thai language using Thai Wikipedia as the knowledge source. The system won first prize in Thailand's National Software Contest 2019 and outperformed competitors by significant margins in accuracy metrics.

Abstract

With vast amounts of information digitized and made available online, manually finding the answer to a question can be tedious. While search engines have emerged to serve information needs, users still have to read through the retrieved articles to locate the answer to a specific question. Therefore, the ability to automatically understand users' natural language questions and find the correct answers could prove crucial in information retrieval. Indeed, such automatic question-answering solutions have been extensively studied by the natural language processing (NLP) research communities. However, most of the development targets questions and information sources composed in high-resource languages such as English and Chinese. In this paper, we propose WabiQA, a novel system for automatically answering questions in the Thai language using Thai Wikipedia articles as the knowledge source. Specifically, the proposed method first retrieves the Wikipedia article that is most likely to contain the answer. Then, a bidirectional LSTM model reads the article and locates candidate answers, which are ranked by confidence level and returned to the user. WabiQA won the first prize award in Thailand's National Software Contest 2019 under the category Question-Answering Program from Thai Wikipedia, with 83.5%, 34.80%, and 45.96% in terms of Accuracy@1, EM, and F1, respectively, outperforming the next best competitors' systems by 19.99, 24.26, and 33.10 percentage points on the same metrics. Furthermore, we also develop a prototype mobile application that aims to assist Thai users with visual impairment using voice-to-speech technology and intelligent question-answer categorization. The findings of this research not only show that intelligent NLP applications for the Thai language can be developed using only existing Thai NLP tools, resources, and deep learning technologies, but also shed light on the possibility of applying such techniques to other intelligent NLP tasks for Thai and other low-resource languages, such as reading assessment, writing assistance, and entity linking.
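
The abstract reports results in terms of Accuracy@1, EM, and F1 without defining them. As a rough illustration only, the sketch below shows how such metrics are commonly computed for extractive question answering. It is not the contest's official scorer: the function names are hypothetical, and scoring F1 over characters (rather than segmented Thai words) is an assumption made here because Thai text has no whitespace word boundaries.

```python
from collections import Counter
from typing import Sequence


def exact_match(prediction: str, gold: str) -> float:
    """Return 1.0 if the predicted answer equals the gold answer exactly.

    Only surrounding whitespace is stripped; any further normalization
    would be an additional assumption.
    """
    return float(prediction.strip() == gold.strip())


def token_f1(pred_tokens: Sequence[str], gold_tokens: Sequence[str]) -> float:
    """Harmonic mean of precision and recall over overlapping tokens.

    Tokens are compared as multisets, so repeated tokens count only as
    often as they appear on both sides.
    """
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def accuracy_at_1(ranked_candidates: Sequence[str], gold: str) -> float:
    """Return 1.0 if the top-ranked candidate matches the gold item."""
    return float(bool(ranked_candidates) and ranked_candidates[0] == gold)


if __name__ == "__main__":
    # Assumption: character-level tokens stand in for a Thai word segmenter.
    pred, gold = "กรุงเทพมหานคร", "กรุงเทพ"
    print(exact_match(pred, gold))
    print(token_f1(list(pred), list(gold)))
    print(accuracy_at_1(["article_12", "article_7"], "article_12"))
```

Per-question scores like these would typically be averaged over the test set to obtain the reported percentages. Swapping the character split for a proper Thai word tokenizer would change the F1 values but not the structure of the computation.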
