3.8 Proceedings Paper

Automated Query Reformulation for Efficient Search based on Query Logs From Stack Overflow

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/ICSE43902.2021.00116

Keywords

Stack Overflow; Data Mining; Query Reformulation; Deep Learning; Query Logs

Funding

  1. National Natural Science Foundation of China [61872263, 61702041, 61202006]
  2. Open Project of State Key Laboratory for Novel Software Technology at Nanjing University [KFKT2019B14]
  3. Australian Research Council [DE180100153]


The research found that the difficulty for developers to efficiently search for the information they need on Stack Overflow mainly stems from the gap between user intentions and text meanings, as well as the semantic gap between queries and post content. To address this issue, an automated software-specific query reformulation approach based on deep learning was proposed, which can generate candidate reformulated queries when given the user's original query. Experimental results demonstrated significant improvements in terms of ExactMatch and GLEU.
As a popular Q&A site for programming, Stack Overflow is a treasure for developers. However, the sheer number of questions and answers on Stack Overflow makes it difficult for developers to efficiently locate the information they are looking for. There are two gaps leading to poor search results: the gap between the user's intention and the textual query, and the semantic gap between the query and the post content. Therefore, developers have to constantly reformulate their queries by correcting misspelled words, adding limitations to certain programming languages or platforms, etc. As query reformulation is tedious for developers, especially for novices, we propose an automated software-specific query reformulation approach based on deep learning. With query logs provided by Stack Overflow, we construct a large-scale query reformulation corpus, including the original queries and corresponding reformulated ones. Our approach trains a Transformer model that can automatically generate candidate reformulated queries when given the user's original query. The evaluation results show that our approach outperforms five state-of-the-art baselines and achieves a 5.6% to 33.5% boost in terms of ExactMatch and a 4.8% to 14.4% boost in terms of GLEU.
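The abstract reports results in terms of ExactMatch without defining it here. As a rough illustration only, the sketch below assumes ExactMatch means that one of the top-k generated candidates equals the reference reformulation after lowercasing and whitespace normalization; the paper's actual definition may differ. The names `normalize`, `exact_match_at_k`, `exact_match_rate`, and `toy_reformulate` are hypothetical, and `toy_reformulate` is a lookup-table stand-in, not the Transformer model described above.

```python
from typing import Callable, List, Sequence, Tuple


def normalize(query: str) -> str:
    """Lowercase and collapse whitespace (an assumed normalization)."""
    return " ".join(query.lower().split())


def exact_match_at_k(candidates: Sequence[str], reference: str, k: int) -> bool:
    """True if any of the first k candidates equals the reference query."""
    ref = normalize(reference)
    return any(normalize(c) == ref for c in candidates[:k])


def exact_match_rate(
    pairs: Sequence[Tuple[str, str]],
    reformulate: Callable[[str], List[str]],
    k: int = 1,
) -> float:
    """Fraction of (original, reference) pairs with a hit in the top k."""
    if not pairs:
        return 0.0
    hits = sum(exact_match_at_k(reformulate(orig), ref, k) for orig, ref in pairs)
    return hits / len(pairs)


def toy_reformulate(query: str) -> List[str]:
    """Hypothetical stand-in for a trained model: fixes two known typos."""
    fixes = {"pyhton": "python", "javascrpt": "javascript"}
    corrected = " ".join(fixes.get(tok, tok) for tok in query.lower().split())
    # Return a ranked list of candidates, best first.
    return [corrected, corrected + " example"]


if __name__ == "__main__":
    # Made-up query pairs for illustration; not taken from the paper's corpus.
    demo_pairs = [
        ("pyhton sort list", "python sort list"),
        ("read file", "java read file"),
    ]
    print(exact_match_rate(demo_pairs, toy_reformulate, k=1))
```

Under this assumed definition, a spelling correction counts as a hit only when the whole candidate string matches the logged reformulation, which is why a partial-credit metric such as GLEU is reported alongside it.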

