4.6 Article

Expanding Queries for Code Search Using Semantically Related API Class-names

Journal

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING
Volume 44, Issue 11, Pages 1070-1082

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TSE.2017.2750682

Keywords

Query expansion; code search; neural network language model; API class-name

Ask authors/readers for more resources

When encountering unfamiliar programming tasks (e.g., connecting to a database), there is a need to seek potential working code examples. Instead of using code search engines, software developers usually post related programming questions on online Q&A forums (e.g., Stack Overflow). One possible reason is that existing code search engines would return effective code examples only if a query contains identifiers (e.g., class or method names). In other words, existing code search engines do not handle natural-language queries well (e.g., a description of a programming task). However, developers may not know the appropriate identifiers at the time of the search. As the demand of searching code examples is increasing, it is of significant interest to enhance code search engines. We conjecture that expanding natural-language queries with their semantically related identifiers has a great potential to enhance code search engines. In this paper, we propose an automated approach to find identifiers (in particular API class-names) that are semantically related to a given natural-language query. We evaluate the effectiveness of our approach using 74 queries on a corpus of 23,677,216 code snippets that are extracted from 24,666 open source Java projects. The results show that our approach can effectively recommend semantically related API class-names to expand the original natural-language queries. For instance, our approach successfully retrieves relevant code examples in the top 10 retrieved results for 76 percent of 74 queries, while it is 36 percent when using the original natural-language query; and the median rank of the first relevant code example is increased from 22 to 7.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.6
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available