Article

Intelligent approach to automated star-schema construction using a knowledge base

EXPERT SYSTEMS WITH APPLICATIONS (2021)

Journal

EXPERT SYSTEMS WITH APPLICATIONS

Volume 182, Issue -, Pages -

Publisher

PERGAMON-ELSEVIER SCIENCE LTD

DOI: 10.1016/j.eswa.2021.115226

Keywords

Data warehouse; Intelligent system; Multidimensional model; Ontology; Semantic approach; Star schema

Categories

Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic; Operations Research & Management Science

Funding

Computer Science and Information Technology Department, Science Faculty, Naresuan University [R2564E059, R2564E060]
Health Systems Research Institute [63-017]
Program Management Unit for Human Resources & Institutional Development, Research, and Innovation [B16F630071]
Thailand Science Research Innovation (TSRI) [CU_FRB640001_01_30_1]

AI summary

The study introduces a knowledge-based model and framework that automatically generates star schemas, addressing the challenges of data warehouse construction. By predicting attribute names and data types, it automates star schema generation and outperforms baseline methods.

Abstract

Most data-warehouse construction processes are performed manually by experts, which is laborious, time-consuming, and prone to error. Furthermore, special knowledge is required to design complex multidimensional models, such as a star schema. This predicament has motivated computer scientists to propose automation techniques to generate such models. For this reason, we present a new strategy that incorporates knowledge-based models into a framework, named the Semantic-based Star-schema Designer, that assists the automation of star schema construction. Our models provide the reasoning capabilities needed for star schema design, including those that can disambiguate heterogeneous terms, detect appropriate data types and attribute sizes, and organize data hierarchies to support online analytical processes. We also propose strategies to overcome the uncertainty arising when attribute names are not available in the data source. The names of unknown attributes are predicted using an arithmetic coding technique. Our system also generates star schemas from semi-structured data (e.g., comma-separated-value files and spreadsheets), which do not provide primary keys, foreign keys, or relationship cardinalities between tables. Our framework constructs star schemas and derives their relationship information without human intervention, using homegrown algorithms. Experiments demonstrate that our technique predicts column names and data types, and thereby enables effective star schema generation, better than baseline approaches.
