Proceedings Paper

Exploring API Embedding for API Usages and Applications

Publisher

IEEE
DOI: 10.1109/ICSE.2017.47

Keywords

Word2Vec; API embedding; API usages; migration

Funding

  1. US National Science Foundation (NSF) [CCF-1723215, CCF-1723432, TWC-1723198, CCF-1518897, CNS-1513263]

Word2Vec is a class of neural network models that, once trained on a large corpus of text, produce for each unique word a corresponding vector in a continuous space in which the linguistic contexts of words can be observed. In this work, we study the characteristics of Word2Vec vectors, called API2VEC or API embeddings, for the API elements within the API sequences in source code. Our empirical study shows that API elements whose API2VEC vectors lie close together have similar usage contexts, that is, similar surrounding APIs. Moreover, API2VEC can capture several similar semantic relations between API elements in API usages via vector offsets. We demonstrate the usefulness of API2VEC vectors for API elements in three applications. First, we build a tool that mines pairs of API elements that share the same usage relations. The other two applications are in the code migration domain. We develop API2API, a tool that automatically learns the API mappings between Java and C# by exploiting a characteristic of the API2VEC vectors for API elements in the two languages: semantic relations among API elements in their usages appear in the two vector spaces as similar geometric arrangements among their API2VEC vectors. Our empirical evaluation shows that API2API achieves relative improvements of 22.6% and 40.1% in top-1 and top-5 accuracy, respectively, over a state-of-the-art mining approach for API mappings. Finally, as another application in code migration, we are able to migrate equivalent API usages from Java to C# with up to 90.6% recall and 87.2% precision.
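The vector-offset idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the three-dimensional vectors below are hand-made for illustration (real API2VEC vectors would be learned from a code corpus), and the helper names `cosine` and `analogy` are hypothetical. The sketch answers "`FileReader.new` is to `FileReader.close` as `FileWriter.new` is to what?" by adding the offset between the first pair to the third vector and picking the nearest remaining API by cosine similarity.

```python
import math

# Hand-made toy vectors (assumption: real API2VEC vectors are learned, not written by hand).
VECTORS = {
    "FileReader.new":   (1.0, 0.0, 0.0),
    "FileReader.close": (1.0, 1.0, 0.0),
    "FileWriter.new":   (0.0, 0.0, 1.0),
    "FileWriter.close": (0.0, 1.0, 1.0),
    "String.length":    (1.0, 0.0, 1.0),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def analogy(a, b, c, vectors=VECTORS):
    """Return the API d that best completes a : b :: c : d via the offset b - a + c."""
    target = tuple(vb - va + vc
                   for va, vb, vc in zip(vectors[a], vectors[b], vectors[c]))
    # Exclude the three query APIs so the answer is a genuinely new element.
    candidates = [name for name in vectors if name not in (a, b, c)]
    return max(candidates, key=lambda name: cosine(vectors[name], target))


print(analogy("FileReader.new", "FileReader.close", "FileWriter.new"))
# prints "FileWriter.close"
```

In this toy space the "create, then close" relation is the same offset for both classes, so the nearest neighbor of the shifted vector is `FileWriter.close`. With learned embeddings the match would be approximate rather than exact.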
