Proceedings Paper

Automated Query Reformulation for Efficient Search based on Query Logs From Stack Overflow

2021 IEEE/ACM 43RD INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2021) (2021)

Pages 1273-1285

Publisher

IEEE COMPUTER SOC

DOI: 10.1109/ICSE43902.2021.00116

Keywords

Stack Overflow; Data Mining; Query Reformulation; Deep Learning; Query Logs

Funding

National Natural Science Foundation of China [61872263, 61702041, 61202006]
Open Project of State Key Laboratory for Novel Software Technology at Nanjing University [KFKT2019B14]
Australian Research Council [DE180100153]

Automated Summary

Developers struggle to search Stack Overflow efficiently mainly because of two gaps: one between the user's intention and the text of the query, and a semantic gap between queries and post content. To address this, the authors propose an automated, software-specific query reformulation approach based on deep learning, which generates candidate reformulated queries from the user's original query. Experimental results show significant improvements in terms of ExactMatch and GLEU.

Abstract

As a popular Q&A site for programming, Stack Overflow is a treasure for developers. However, the sheer number of questions and answers on Stack Overflow makes it difficult for developers to efficiently locate the information they are looking for. There are two gaps leading to poor search results: the gap between the user's intention and the textual query, and the semantic gap between the query and the post content. Therefore, developers have to constantly reformulate their queries by correcting misspelled words, adding limitations to certain programming languages or platforms, etc. As query reformulation is tedious for developers, especially for novices, we propose an automated software-specific query reformulation approach based on deep learning. With query logs provided by Stack Overflow, we construct a large-scale query reformulation corpus, including the original queries and corresponding reformulated ones. Our approach trains a Transformer model that can automatically generate candidate reformulated queries when given the user's original query. The evaluation results show that our approach outperforms five state-of-the-art baselines and achieves a 5.6% to 33.5% boost in terms of ExactMatch and a 4.8% to 14.4% boost in terms of GLEU.
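
The two metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' evaluation code. The abstract does not say which GLEU variant was used, so the sketch assumes the sentence-level "Google BLEU" definition (the minimum of n-gram precision and recall over 1- to 4-grams). The function names `exact_match` and `gleu`, the whitespace tokenization, and the case-insensitive comparison are likewise assumptions made for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def exact_match(predictions, references):
    """Fraction of predicted queries identical to the reference query.

    Assumption: comparison ignores case and surrounding whitespace.
    """
    hits = sum(
        p.strip().lower() == r.strip().lower()
        for p, r in zip(predictions, references)
    )
    return hits / len(references)


def gleu(prediction, reference, max_n=4):
    """Sentence-level GLEU, assuming the Google-BLEU definition:
    min(precision, recall) over all 1..max_n-grams."""
    hyp = prediction.lower().split()
    ref = reference.lower().split()
    matches = hyp_total = ref_total = 0
    for n in range(1, max_n + 1):
        h, r = ngrams(hyp, n), ngrams(ref, n)
        matches += sum((h & r).values())  # clipped n-gram overlap
        hyp_total += sum(h.values())
        ref_total += sum(r.values())
    if hyp_total == 0 or ref_total == 0:
        return 0.0
    return min(matches / hyp_total, matches / ref_total)


if __name__ == "__main__":
    # Hypothetical queries, invented for illustration only.
    preds = ["python sort list", "java array"]
    refs = ["python sort list", "java arraylist"]
    print(exact_match(preds, refs))
    print(gleu(preds[1], refs[1]))
```

ExactMatch rewards only a generated query that reproduces the reference reformulation in full, while GLEU gives partial credit for overlapping n-grams, which is presumably why the abstract reports both.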
