Multi-Temperate Logical Data Warehouse Design for Large-Scale Healthcare Data

BIG DATA RESEARCH (2021)

Journal

BIG DATA RESEARCH

Volume 25

Publisher

ELSEVIER

DOI: 10.1016/j.bdr.2021.100255

Keywords

Data warehouse design; OLAP workloads; Healthcare data management; Data partitioning algorithms; Logical data warehouses; Columnar databases

Automated Summary

Advances in modern hardware architectures and database technology have increased the adoption of logical data warehouses (LDWs) as complements to traditional physical data warehousing (PDW) approaches. LDWs integrate and transform data at run-time, and in premium hardware environments they depend on replicating high-value data to the physical core to maximize spatial locality. This study gathers an OLAP workload from a large healthcare organization and uses it to support and evaluate LDW design algorithms for multi-temperate storage systems.

Abstract

Modern hardware architectures and advances in database technology are driving increased adoption of logical data warehouses (LDWs) that complement traditional physical data warehousing (PDW) approaches. In contrast to PDW design methodologies that emphasize physical consolidation of all data of interest on a single (perhaps distributed) computing platform, along with early-binding approaches that pre-materialize transformations and changes to the source data, LDW techniques allow for the integration and transformation of data at run-time and typically physically move or modify much less data in advance. In an environment with premium hardware such as multi-temperate storage, the successful design of LDWs depends on replication of high-value data to their physical core to maximize spatial locality. Identifying and collocating high-value data is a non-trivial task that has not been adequately explored in the context of LDWs in multi-temperate storage systems. In this paper, we gather queries to construct an OLAP workload for use in supporting and evaluating LDW design algorithms for a large healthcare organization. We introduce new algorithms to address the preprocessing of the workload, identification of data clusters to support OLAP queries, and assignment of clusters to appropriate (hot, warm, and cold) storage tiers, allowing the LDW to deliver results more efficiently by covering a higher percentage of its query workload using the fastest storage devices. Any use case involving copying data from sources to tiered storage targets for analytic querying could benefit from the techniques and solutions presented here. © 2021 Elsevier Inc. All rights reserved.
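The abstract does not spell out the paper's algorithms, so the following is only an illustrative sketch of the general idea of assigning data clusters to hot, warm, and cold tiers so that the fastest tier covers as much of the query workload as possible. It uses a simple greedy heuristic (rank clusters by queries served per gigabyte, then fill tiers fastest-first), which is an assumption made here and not the authors' method. The `Cluster` type, the function names, the cluster names, sizes, and tier capacities are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    """Hypothetical data cluster: a group of columns/tables queried together."""
    name: str
    size_gb: float
    queries: frozenset  # ids of workload queries this cluster fully serves


def assign_tiers(clusters, capacities):
    """Greedy sketch (an assumption, not the paper's algorithm).

    `capacities` maps tier name -> free space in GB, ordered fastest first.
    Clusters with the most queries served per GB are placed first, each into
    the fastest tier that still has room.
    """
    ranked = sorted(clusters, key=lambda c: len(c.queries) / c.size_gb, reverse=True)
    remaining = dict(capacities)  # dicts preserve insertion order: fastest first
    placement = {}
    for cluster in ranked:
        for tier in remaining:
            if cluster.size_gb <= remaining[tier]:
                placement[cluster.name] = tier
                remaining[tier] -= cluster.size_gb
                break
        else:
            raise ValueError(f"no tier has room for cluster {cluster.name!r}")
    return placement


def tier_coverage(clusters, placement, tier, total_queries):
    """Fraction of the workload answered entirely from clusters in `tier`."""
    covered = set()
    for cluster in clusters:
        if placement[cluster.name] == tier:
            covered |= cluster.queries
    return len(covered) / total_queries


# Hypothetical workload of 7 queries over three made-up clusters.
clusters = [
    Cluster("encounters", 40.0, frozenset({1, 2, 3, 4})),
    Cluster("labs", 100.0, frozenset({5, 6})),
    Cluster("claims_archive", 500.0, frozenset({7})),
]
capacities = {"hot": 50.0, "warm": 150.0, "cold": 1000.0}

placement = assign_tiers(clusters, capacities)
hot_coverage = tier_coverage(clusters, placement, "hot", total_queries=7)
print(placement)
print(f"hot-tier coverage: {hot_coverage:.2f}")
```

In this toy setup the small, frequently queried cluster lands on the hot tier and the large, rarely queried one on the cold tier. A greedy density ranking is only one of several plausible heuristics; it does not account for queries that span multiple clusters.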
