Article

Detecting coherent explorations in SQL workloads

Journal

INFORMATION SYSTEMS
Volume 92

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.is.2019.101479

This paper proposes an approach to better understand a workload of SQL queries and to detect coherent explorations hidden within it. In particular, we investigate SQLShare (Jain et al., 2016), a database-as-a-service platform targeting scientists and data scientists with minimal database experience, whose workload was made available to the research community. According to Jain et al. (2016), this workload is the only one containing primarily ad-hoc handwritten queries over user-uploaded datasets. We analyze this workload by extracting features that characterize SQL queries, and we investigate three machine learning approaches that use these features to separate sequences of SQL queries into meaningful explorations. The first approach is unsupervised and based only on the similarity between contiguous queries. The second approach uses transfer learning to apply a model trained on a dataset for which ground truth is available. The last approach uses weak labeling: heuristics label a training set, from which the most probable segmentation is predicted. We ran several tests over various query workloads to evaluate and compare the proposed methods. © 2019 Elsevier Ltd. All rights reserved.
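
To make the first, unsupervised idea concrete, the following is a minimal sketch of segmenting a query sequence by the similarity between contiguous queries. It is not the paper's method: the token-set features, the Jaccard measure, the `threshold` value, and the function names `query_tokens`, `jaccard`, and `segment` are all assumptions chosen for illustration, since the abstract does not specify the features or the similarity function.

```python
import re
from typing import List, Set


def query_tokens(sql: str) -> Set[str]:
    """Lower-cased identifiers and keywords of a query.

    Assumption: a crude stand-in for the query features the paper extracts.
    """
    return set(re.findall(r"[a-z_][a-z0-9_]*", sql.lower()))


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two token sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def segment(queries: List[str], threshold: float = 0.3) -> List[List[str]]:
    """Split a query sequence into explorations.

    A new exploration starts whenever a query's similarity to the
    immediately preceding query falls below `threshold` (hypothetical value).
    """
    explorations: List[List[str]] = []
    for q in queries:
        if explorations:
            previous = explorations[-1][-1]
            if jaccard(query_tokens(previous), query_tokens(q)) >= threshold:
                explorations[-1].append(q)
                continue
        explorations.append([q])
    return explorations


workload = [
    "SELECT name FROM genes WHERE length > 100",
    "SELECT name, length FROM genes WHERE length > 500",
    "SELECT station, AVG(temp) FROM weather GROUP BY station",
]
for i, exploration in enumerate(segment(workload), start=1):
    print(i, exploration)
```

In this toy workload the two queries over `genes` stay in one exploration and the `weather` query starts a new one. A threshold-based cut like this only compares adjacent queries, which is why the supervised and weakly supervised alternatives described above may be preferable when labeled or heuristically labeled data exists.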
